Ensemble Gaussian mixture models for probability density estimation

Estimation of probability density functions (PDFs) is a fundamental task in statistics. This paper proposes an ensemble learning approach to density estimation based on Gaussian mixture models (GMMs). Ensemble learning is closely related to model averaging: whereas standard model selection determines the single most suitable GMM, the ensemble approach combines a subset of GMMs in order to improve the precision and stability of the estimated probability density function. The ensemble GMM is investigated theoretically, and numerical experiments are conducted to demonstrate the benefits of the model. The evaluations show promising results for classification and for the approximation of non-Gaussian PDFs.
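To make the idea concrete, the following minimal Python sketch illustrates one way an ensemble of GMMs can be combined for density estimation: several mixtures with different component counts are fitted on bootstrap resamples and their densities are averaged with uniform weights. This is an illustrative assumption, not the authors' exact procedure; the diversification strategy (bootstrap plus varying component counts), the uniform weighting, and the toy data are all choices made here for demonstration.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)

# Toy non-Gaussian data: a Gaussian bump plus a skewed gamma component.
data = np.concatenate([
    rng.normal(-2.0, 0.5, size=300),
    rng.gamma(shape=2.0, scale=1.0, size=700),
]).reshape(-1, 1)

# Build a simple ensemble: members differ in component count and in the
# bootstrap resample they are trained on (one way to obtain diversity;
# assumed here, not prescribed by the paper).
ensemble = []
for n_components in (2, 3, 4, 5):
    boot = data[rng.integers(0, len(data), size=len(data))]
    gmm = GaussianMixture(n_components=n_components, random_state=0)
    gmm.fit(boot)
    ensemble.append(gmm)

def ensemble_density(x, models):
    """Uniformly average the member densities at the query points x."""
    # score_samples returns log-densities, so exponentiate before averaging.
    densities = np.stack([np.exp(m.score_samples(x)) for m in models])
    return densities.mean(axis=0)

# Evaluate the combined density estimate on a grid.
grid = np.linspace(-4.0, 8.0, 200).reshape(-1, 1)
p_hat = ensemble_density(grid, ensemble)
print(p_hat[:5])
```

Averaging the member densities rather than selecting a single fitted mixture is what gives the ensemble its stabilizing effect: individual GMMs may over- or under-smooth depending on their component count and training sample, and the combination tends to reduce the variance of the resulting estimate.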
