Accelerated greedy mixture learning

Mixture probability densities are popular models used in many data mining and machine learning applications, such as clustering. A standard algorithm for learning such models from data is the Expectation-Maximization (EM) algorithm. However, EM can be slow on large datasets, so approximation techniques are needed. In this paper we propose a variational approximation to the greedy EM algorithm that offers speedups at least linear in the number of data points. Moreover, because it strictly increases a lower bound on the data log-likelihood in every learning step, our algorithm is guaranteed to converge. We demonstrate the proposed algorithm on a synthetic experiment, where it obtains satisfactory results.
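To make the lower-bound argument concrete, the following is a minimal sketch of standard (non-greedy, non-accelerated) EM for a one-dimensional Gaussian mixture. It is not the algorithm proposed in the paper. It only illustrates the general principle the abstract relies on: each EM step does not decrease a lower bound F(q, theta) on the data log-likelihood. All function names, the synthetic data, and the initial values are assumptions chosen for illustration, and the sketch omits safeguards such as a variance floor.

```python
import math
import random


def log_gauss(x, mean, var):
    """Log-density of a 1-D Gaussian N(mean, var) at x."""
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - mean) ** 2 / var)


def log_joint(x, weights, means, variances):
    """log(pi_k * p(x | k)) for every component k."""
    return [math.log(w) + log_gauss(x, m, v)
            for w, m, v in zip(weights, means, variances)]


def logsumexp(values):
    """Numerically stable log(sum(exp(values)))."""
    top = max(values)
    return top + math.log(sum(math.exp(v - top) for v in values))


def log_likelihood(data, weights, means, variances):
    """Data log-likelihood under the mixture."""
    return sum(logsumexp(log_joint(x, weights, means, variances)) for x in data)


def e_step(data, weights, means, variances):
    """Responsibilities q[i][k]: posterior of component k given point i."""
    resp = []
    for x in data:
        lj = log_joint(x, weights, means, variances)
        norm = logsumexp(lj)
        resp.append([math.exp(v - norm) for v in lj])
    return resp


def m_step(data, resp):
    """Closed-form update of weights, means, and variances."""
    n, k = len(data), len(resp[0])
    weights, means, variances = [], [], []
    for j in range(k):
        nj = sum(r[j] for r in resp)
        mean = sum(r[j] * x for r, x in zip(resp, data)) / nj
        var = sum(r[j] * (x - mean) ** 2 for r, x in zip(resp, data)) / nj
        weights.append(nj / n)
        means.append(mean)
        variances.append(var)
    return weights, means, variances


def lower_bound(data, resp, weights, means, variances):
    """F(q, theta) = sum_i sum_k q_ik * (log(pi_k p(x_i | k)) - log q_ik)."""
    total = 0.0
    for x, r in zip(data, resp):
        lj = log_joint(x, weights, means, variances)
        for q, v in zip(r, lj):
            if q > 0.0:  # terms with q == 0 contribute nothing
                total += q * (v - math.log(q))
    return total


def demo(n_iter=20, seed=0):
    """Run EM on synthetic data (an illustrative assumption) and track F."""
    rng = random.Random(seed)
    data = ([rng.gauss(0.0, 1.0) for _ in range(200)]
            + [rng.gauss(5.0, 1.0) for _ in range(200)])
    weights, means, variances = [0.5, 0.5], [1.0, 4.0], [1.0, 1.0]
    bounds = []
    for _ in range(n_iter):
        resp = e_step(data, weights, means, variances)
        weights, means, variances = m_step(data, resp)
        bounds.append(lower_bound(data, resp, weights, means, variances))
    final_ll = log_likelihood(data, weights, means, variances)
    return weights, means, variances, bounds, final_ll
```

Calling `demo()` returns the fitted parameters, the sequence of bound values (one per iteration), and the final log-likelihood. The bound values should be non-decreasing and never exceed the log-likelihood, which is the property that underlies the convergence guarantee.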
