A New Implementation of k-MLE for Mixture Modeling of Wishart Distributions

We describe an original implementation of the k-Maximum Likelihood Estimator (k-MLE) [1], a fast algorithm for learning finite statistical mixtures of exponential families. Our version converges to a local maximum of the complete likelihood and guarantees that no cluster becomes empty. To initialize k-MLE, we propose a careful, greedy strategy inspired by k-means++ that automatically selects the cluster centers and their number. The paper gives all the details needed to use k-MLE with Wishart mixture models (WMMs). Finally, we propose the Cauchy-Schwarz divergence as a comparison measure between two WMMs and give a general methodology for building a motion retrieval system.
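
As a rough illustration of the seeding idea that the initialization is inspired by, the following sketch implements plain k-means++ sampling. It is not the paper's strategy: it assumes a fixed number of clusters k, points given as tuples of floats, and squared Euclidean distance, whereas the proposed initialization also selects the number of clusters and targets Wishart mixtures. The names `squared_distance` and `kmeanspp_seeds` are hypothetical.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeanspp_seeds(points, k, seed=None):
    """Pick up to k seeds from `points` with standard k-means++ sampling.

    Assumption: Euclidean geometry and a fixed k, unlike the paper's variant.
    """
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    # nearest[i] is the squared distance from points[i] to its closest seed.
    nearest = [squared_distance(p, centers[0]) for p in points]
    while len(centers) < k:
        if sum(nearest) == 0.0:
            break  # every point coincides with a seed; nothing left to add
        # Draw the next seed with probability proportional to nearest[i].
        idx = rng.choices(range(len(points)), weights=nearest, k=1)[0]
        centers.append(points[idx])
        nearest = [min(d, squared_distance(p, points[idx]))
                   for d, p in zip(nearest, points)]
    return centers


data = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
seeds = kmeanspp_seeds(data, k=3, seed=0)
```

Points far from every seed chosen so far are more likely to be drawn next, which spreads the seeds over the data. Adapting the sketch to WMMs would require replacing `squared_distance` with a dissimilarity suited to the Wishart family, which is left out here.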

[1] Inderjit S. Dhillon, et al. Clustering with Bregman Divergences, 2005, J. Mach. Learn. Res.

[2] J. Wishart. The Generalised Product Moment Distribution in Samples from a Normal Multivariate Population, 1928.

[3] Sergei Vassilvitskii, et al. k-means++: the advantages of careful seeding, 2007, SODA '07.

[4] G. McLachlan, et al. The EM Algorithm and Extensions: Second Edition, 2008.

[5] Sullivan Hidot, et al. An Expectation-Maximization algorithm for the Wishart mixture model: Application to movement clustering, 2010, Pattern Recognit. Lett.

[6] Michael I. Jordan, et al. Revisiting k-means: New Algorithms via Bayesian Nonparametrics, 2011, ICML.

[7] J. A. Hartigan, et al. A k-means clustering algorithm, 1979.

[8] Jean-Paul Chilès, et al. Wiley Series in Probability and Statistics, 2012.

[9] Frank Nielsen, et al. Statistical exponential families: A digest with flash cards, 2009, ArXiv.

[10] G. McLachlan, et al. The EM algorithm and extensions, 1996.

[11] Matus Telgarsky, et al. Hartigan's Method: k-means Clustering without Voronoi, 2010, AISTATS.

[12] Frank Nielsen, et al. Closed-form information-theoretic divergences for statistical mixtures, 2012, Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012).

[13] Frank Nielsen, et al. K-MLE: A fast algorithm for learning statistical mixture models, 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[14] Lawrence Carin, et al. Variational Bayes for continuous hidden Markov models and its application to active learning, 2006, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[15] Frank Nielsen, et al. PyMEF — A framework for exponential families in Python, 2011, 2011 IEEE Statistical Signal Processing Workshop (SSP).