Boosting Gaussian mixture models via discriminant analysis

The Gaussian mixture model (GMM) can approximate arbitrary probability distributions, which makes it a powerful tool for feature representation and classification. However, its performance degrades when training data are insufficient, especially when the feature space is high-dimensional. In this paper, we present a novel approach that boosts GMMs via discriminant analysis, in which the required amount of training data depends only on the number of classes, regardless of the feature dimension. We demonstrate the effectiveness of the proposed BoostGMM-DA classifier by applying it to emotion recognition in speech. Our experimental results indicate that the BoostGMM-DA classifier achieves significantly higher recognition rates than the conventional GMM minimum error rate (MER) classifier under the same training conditions, and that it requires significantly less training data to yield recognition rates comparable to those of the GMM MER classifier.
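
For orientation, the baseline named above can be written in its standard textbook form. This is a sketch under stated assumptions, not the formulation used in the paper: the symbols $K$ (components per class), $d$ (feature dimension), $w_{c,k}$, $\boldsymbol{\mu}_{c,k}$, $\boldsymbol{\Sigma}_{c,k}$ (mixture weights, means, and full covariances of class $c$), and $P(c)$ (class prior) are introduced here for illustration and do not appear in the abstract.

```latex
% Class-conditional GMM density (standard form; notation assumed)
p(\mathbf{x} \mid c) = \sum_{k=1}^{K} w_{c,k}\,
  \mathcal{N}\!\left(\mathbf{x};\, \boldsymbol{\mu}_{c,k},\, \boldsymbol{\Sigma}_{c,k}\right),
\qquad \sum_{k=1}^{K} w_{c,k} = 1

% Minimum error rate (Bayes) decision rule: pick the class with the largest posterior
\hat{c}(\mathbf{x}) = \arg\max_{c}\; P(c)\, p(\mathbf{x} \mid c)

% Free parameters per class with full covariance matrices:
% (K - 1) weights + K d mean entries + K d(d+1)/2 covariance entries
N_{\text{params}} = (K - 1) + K d + K\,\frac{d(d+1)}{2}
```

Under these assumptions the parameter count grows quadratically with $d$, which is one common way to see why a conventional GMM classifier needs more training data as the feature dimension increases.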
