A kurtosis-based dynamic approach to Gaussian mixture modeling

We address the problem of probability density function estimation using a Gaussian mixture model updated with the expectation-maximization (EM) algorithm. To handle an unknown number of mixing kernels, we define a new measure for Gaussian mixtures, called total kurtosis, which is based on the weighted sample kurtoses of the kernels. This measure indicates how well the Gaussian mixture fits the data. We then propose a new dynamic algorithm for Gaussian mixture density estimation that monitors the total kurtosis at each step of the EM algorithm in order to decide dynamically on the correct number of kernels and, where possible, escape from local maxima of the likelihood. We demonstrate the potential of our technique for approximating unknown densities on several density estimation problems.
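To make the idea of a kurtosis-based fit measure concrete, the following is a minimal one-dimensional sketch. The abstract does not give the formula, so the definition used here is an assumption: each kernel's weighted sample kurtosis is computed with the EM posteriors as weights, and the total is the mixing-weight-weighted sum of absolute deviations from 3, the kurtosis of a Gaussian. The function names `gaussian_pdf`, `responsibilities`, and `total_kurtosis` are hypothetical and chosen only for illustration.

```python
import math
import random


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def responsibilities(data, weights, means, variances):
    """E-step posteriors P(j | x_i) for a 1-D Gaussian mixture."""
    resp = []
    for x in data:
        joint = [w * gaussian_pdf(x, m, v)
                 for w, m, v in zip(weights, means, variances)]
        total = sum(joint)
        resp.append([p / total for p in joint])
    return resp


def total_kurtosis(data, weights, means, variances):
    """Assumed definition: sum_j pi_j * |k_j - 3|, where k_j is the
    posterior-weighted fourth standardized moment of kernel j."""
    resp = responsibilities(data, weights, means, variances)
    total = 0.0
    for j, (w, m, v) in enumerate(zip(weights, means, variances)):
        mass = sum(r[j] for r in resp)
        fourth = sum(r[j] * ((x - m) ** 2 / v) ** 2
                     for x, r in zip(data, resp)) / mass
        total += w * abs(fourth - 3.0)
    return total


if __name__ == "__main__":
    random.seed(0)
    # Bimodal sample: two well-separated unit-variance Gaussians.
    data = ([random.gauss(-4.0, 1.0) for _ in range(5000)]
            + [random.gauss(4.0, 1.0) for _ in range(5000)])
    mean = sum(data) / len(data)
    var = sum((x - mean) ** 2 for x in data) / len(data)
    print("one kernel :", total_kurtosis(data, [1.0], [mean], [var]))
    print("two kernels:", total_kurtosis(data, [0.5, 0.5], [-4.0, 4.0], [1.0, 1.0]))
```

Under this assumed definition, a single kernel stretched over two well-separated modes has a weighted kurtosis far from 3, while a mixture whose kernels match the modes gives a value close to zero. A dynamic scheme of the kind described could use such a gap as the signal to add or remove a kernel during EM, though the decision rule itself is not specified in the abstract.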
