On-line expectation–maximization algorithm for latent data models

In this contribution, we propose a generic online (also sometimes called adaptive or recursive) version of the expectation–maximization (EM) algorithm applicable to latent variable models of independent observations. Compared to the algorithm of Titterington (1984), this approach is more directly connected to the usual EM algorithm and does not rely on integration with respect to the complete data distribution. The resulting algorithm is usually simpler and is shown to achieve convergence to the stationary points of the Kullback-Leibler divergence between the marginal distribution of the observations and the model distribution at the optimal rate, i.e., that of the maximum likelihood estimator. In addition, the proposed approach is suitable for conditional (or regression) models, as illustrated in the case of the mixture of linear regressions model.
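To make the recursion concrete, the sketch below applies the online E-step/M-step idea to a one-dimensional Gaussian mixture: each new observation updates a running estimate of the expected complete-data sufficient statistics by a stochastic-approximation step s <- s + gamma_n (s_bar(y_n; theta) - s), and the parameters are recovered from these statistics in closed form. This is a minimal illustration only: the specific model, the step-size schedule gamma_n = n^(-0.6), the burn-in safeguard n_min, and the name online_em_gmm are assumptions made for the example, not the paper's exact formulation, which is stated for general latent data models.

    # Illustrative sketch of an online EM recursion for a 1-D Gaussian
    # mixture (model, step sizes, and names are assumptions, not the
    # paper's exact algorithm).
    import numpy as np

    def online_em_gmm(y, theta0, n_min=20):
        """Process observations y one at a time, starting from
        theta0 = (weights, means, variances)."""
        w, mu, var = (np.asarray(a, dtype=float).copy() for a in theta0)
        # Running estimates of the expected complete-data sufficient
        # statistics: s0[k] ~ E[1{Z=k}], s1[k] ~ E[1{Z=k} Y],
        # s2[k] ~ E[1{Z=k} Y^2].
        s0, s1, s2 = w.copy(), w * mu, w * (var + mu ** 2)
        for n, yn in enumerate(y, start=1):
            # E-step: posterior responsibilities under current parameters.
            p = w * np.exp(-0.5 * (yn - mu) ** 2 / var) / np.sqrt(var)
            p /= p.sum()
            # Stochastic-approximation update of the statistics, with a
            # slowly decaying step size (illustrative choice).
            g = n ** -0.6
            s0 += g * (p - s0)
            s1 += g * (p * yn - s1)
            s2 += g * (p * yn ** 2 - s2)
            # M-step: map statistics back to parameters in closed form,
            # after a short burn-in to avoid degenerate early updates.
            if n >= n_min:
                w = s0 / s0.sum()
                mu = s1 / s0
                var = np.maximum(s2 / s0 - mu ** 2, 1e-8)
        return w, mu, var

    rng = np.random.default_rng(0)
    data = np.concatenate([rng.normal(-2.0, 1.0, 5000),
                           rng.normal(2.0, 1.0, 5000)])
    rng.shuffle(data)
    print(online_em_gmm(data, ([0.5, 0.5], [-1.0, 1.0], [1.0, 1.0])))

Note that, unlike batch EM, no pass over previously seen data is needed: each observation is used once in the stochastic-approximation step, which is what makes the procedure online.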

[1] D. Ruppert et al. Efficient Estimations from a Slowly Convergent Robbins-Monro Process, 1988.

[2] H. Kushner et al. Stochastic Approximation and Recursive Algorithms and Applications, 2003.

[3] Steven J. Nowlan et al. Soft competitive adaptation: neural network learning algorithms based on fitting statistical mixtures, 1991.

[4] Boris Polyak et al. Acceleration of stochastic approximation by averaging, 1992.

[5] A. F. Smith et al. Statistical analysis of finite mixture distributions, 1986.

[6] Geoffrey E. Hinton et al. A View of the EM Algorithm that Justifies Incremental, Sparse, and Other Variants, 1998, Learning in Graphical Models.

[7] Shaojun Wang et al. Almost sure convergence of Titterington's recursive estimator for mixture models, 2002, Proceedings IEEE International Symposium on Information Theory.

[8] Carlos S. Kubrusly et al. Stochastic approximation algorithms and applications, 1973, CDC 1973.

[9] C. Robert et al. Estimating Mixtures of Regressions, 2003.

[10] Eric Moulines et al. Recursive EM Algorithm with Applications to DOA Estimation, 2006, 2006 IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings.

[11] M. Pelletier. Weak convergence rates for stochastic approximation with application to multiple targets and simulated annealing, 1998.

[12] Pei Jung Chung et al. Recursive EM and SAGE-inspired algorithms with application to DOA estimation, 2005, IEEE Transactions on Signal Processing.

[13] C. F. J. Wu. On the Convergence Properties of the EM Algorithm, 1983.

[14] Jalal Almhana et al. Online EM algorithm for mixture with application to internet traffic modeling, 2004.

[15] Shin Ishii et al. On-line EM Algorithm for the Normalized Gaussian Network, 2000, Neural Computation.

[16] Friedrich Leisch et al. Fitting finite mixtures of generalized linear regressions in R, 2007, Comput. Stat. Data Anal.

[17] M. A. Tanner. Tools for Statistical Inference, 1991.

[18] Robert A. Jacobs et al. Hierarchical Mixtures of Experts and the EM Algorithm, 1993, Neural Computation.

[19] K. Lange. A gradient algorithm locally equivalent to the EM algorithm, 1995.

[20] G. McLachlan et al. The EM algorithm and extensions, 1996.

[21] A. Mokkadem et al. Convergence rate and averaging of nonlinear two-time-scale stochastic approximation algorithms, 2006, math/0610329.

[22] D. Titterington. Recursive Parameter Estimation Using Incomplete Data, 1984.

[23] D. Rubin et al. Maximum likelihood from incomplete data via the EM algorithm (with discussion), 1977.

[24] Eric Moulines et al. Stability of Stochastic Approximation under Verifiable Conditions, 2005, Proceedings of the 44th IEEE Conference on Decision and Control.

[25] F. Leisch. FlexMix: A general framework for finite mixture models and latent class regression in R, 2004.