Audio-visual emotion recognition with multilayer boosted HMM

Emotion recognition has become an important task in modern human-computer interaction. This paper presents a multilayer boosted HMM (MBHMM) classifier for automatic audio-visual emotion recognition. A modified Baum-Welch algorithm is proposed for learning the component HMMs, and adaptive boosting (AdaBoost) is used to train an ensemble classifier for each layer (cue). Except in the first layer, the initial weights of the training samples in the current layer are determined by the recognition results of the ensemble classifier in the preceding layer. Training with the current cue can thus focus on the samples that were difficult to classify with the previous cue. The MBHMM classifier combines these ensemble classifiers and exploits the complementary information from multiple cues and modalities. Experimental results on audio-visual emotion data, collected in Wizard of Oz scenarios and labeled under two types of emotion category sets, demonstrate that the approach is effective and promising.
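
The layer-to-layer weight hand-off described above can be illustrated with a minimal sketch. The abstract does not give the exact rule that maps the preceding layer's recognition results to the initial weights, so the code below assumes a standard AdaBoost-style reweighting applied to uniform weights. The function names `uniform_weights` and `initial_weights_from_previous_layer` are hypothetical and are not taken from the paper.

```python
import math


def uniform_weights(n):
    """Initial weights for the first layer: every sample counts equally."""
    if n <= 0:
        raise ValueError("need at least one training sample")
    return [1.0 / n] * n


def initial_weights_from_previous_layer(prev_correct, eps=1e-10):
    """Initial weights for a later layer (assumed AdaBoost-style rule).

    prev_correct[i] is True when the preceding layer's ensemble
    classified training sample i correctly. Misclassified samples
    receive larger weights so the current cue focuses on them.
    """
    n = len(prev_correct)
    if n == 0:
        raise ValueError("need at least one training sample")
    # Error rate of the preceding ensemble under uniform weights,
    # clipped away from 0 and 1 to keep the logarithm finite.
    err = sum(1 for c in prev_correct if not c) / n
    err = min(max(err, eps), 1.0 - eps)
    alpha = 0.5 * math.log((1.0 - err) / err)
    # Scale correct samples down and misclassified samples up, then normalize.
    raw = [math.exp(-alpha if c else alpha) for c in prev_correct]
    total = sum(raw)
    return [w / total for w in raw]


if __name__ == "__main__":
    # Toy case: the preceding cue misclassified only the last sample.
    print(initial_weights_from_previous_layer([True, True, True, False]))
```

Under this assumed rule, the samples misclassified with the previous cue carry half of the total weight at the start of the next layer, which is one way to realize the focus on difficult samples; the rule actually used in MBHMM may differ.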