On‐line expectation–maximization algorithm for latent data models

Summary.  We propose a generic on-line (also sometimes called adaptive or recursive) version of the expectation–maximization (EM) algorithm applicable to latent variable models of independent observations. Compared with the algorithm of Titterington [3], this approach is more directly connected to the usual EM algorithm and does not rely on integration with respect to the complete-data distribution. The resulting algorithm is usually simpler and is shown to converge to the stationary points of the Kullback–Leibler divergence between the marginal distribution of the observations and the model distribution at the optimal rate, i.e. that of the maximum likelihood estimator. In addition, the proposed approach is also suitable for conditional (or regression) models, as illustrated in the case of the mixture of linear regressions model.
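To make the recursion concrete, below is a minimal sketch of the kind of on-line EM update the summary describes, for a two-component one-dimensional Gaussian mixture: the E-step is replaced by a stochastic-approximation update of the running complete-data sufficient statistics, while the M-step is the usual exact maximization. The model choice, the function names (`e_step_stats`, `m_step`, `online_em`) and the step-size schedule are illustrative assumptions, not code from the paper.

```python
import numpy as np

def e_step_stats(y, w, mu, var):
    """Conditional expectation of the complete-data sufficient
    statistics for one observation y, given parameters (w, mu, var)."""
    # Responsibilities p(component k | y); tiny constant guards underflow.
    dens = w * np.exp(-0.5 * (y - mu) ** 2 / var) / np.sqrt(2 * np.pi * var)
    r = dens / (dens.sum() + 1e-300)
    return r, r * y, r * y ** 2  # per-component (s0, s1, s2)

def m_step(s0, s1, s2):
    """Usual exact M-step, expressed through the running statistics."""
    w = s0 / s0.sum()
    mu = s1 / s0
    var = s2 / s0 - mu ** 2
    return w, mu, np.maximum(var, 1e-8)  # guard degenerate variances

def online_em(ys, w, mu, var, gamma0=1.0, alpha=0.6):
    # Initialize the running statistics with a plain E-step on the first point.
    s0, s1, s2 = e_step_stats(ys[0], w, mu, var)
    for n, y in enumerate(ys[1:], start=1):
        gamma = gamma0 / (n + 1) ** alpha  # step size, alpha in (1/2, 1]
        t0, t1, t2 = e_step_stats(y, w, mu, var)
        # Stochastic-approximation E-step: s <- s + gamma * (s_hat - s).
        s0 += gamma * (t0 - s0)
        s1 += gamma * (t1 - s1)
        s2 += gamma * (t2 - s2)
        w, mu, var = m_step(s0, s1, s2)  # M-step itself is unchanged
    return w, mu, var

# Toy run on a 0.3 * N(-2, 1) + 0.7 * N(2, 1) mixture.
rng = np.random.default_rng(0)
n_obs = 20000
ys = np.where(rng.random(n_obs) < 0.3,
              rng.normal(-2.0, 1.0, n_obs),
              rng.normal(2.0, 1.0, n_obs))
print(online_em(ys,
                w=np.array([0.5, 0.5]),
                mu=np.array([-1.0, 1.0]),
                var=np.array([1.0, 1.0])))
```

Step sizes proportional to n^{-alpha} with alpha in (1/2, 1], possibly combined with averaging of the parameter trajectory in the spirit of Ruppert [5] and Polyak [8], are the ingredients behind the rate claim in the summary.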

[1]  C. S. Kubrusly et al. Stochastic approximation algorithms and applications. In Proc. IEEE Conference on Decision and Control (CDC), 1973.

[2]  C. F. J. Wu. On the convergence properties of the EM algorithm. Ann. Statist., 1983.

[3]  D. M. Titterington. Recursive parameter estimation using incomplete data. J. R. Statist. Soc. B, 1984.

[4]  D. M. Titterington, A. F. M. Smith and U. E. Makov. Statistical Analysis of Finite Mixture Distributions. Wiley, 1985.

[5]  D. Ruppert. Efficient estimations from a slowly convergent Robbins–Monro process. Technical report, Cornell University, 1988.

[6]  S. J. Nowlan. Soft competitive adaptation: neural network learning algorithms based on fitting statistical mixtures. PhD thesis, Carnegie Mellon University, 1991.

[7]  M. A. Tanner. Tools for Statistical Inference. Springer, 1991.

[8]  B. T. Polyak and A. B. Juditsky. Acceleration of stochastic approximation by averaging. SIAM J. Control Optim., 1992.

[9]  M. I. Jordan and R. A. Jacobs. Hierarchical mixtures of experts and the EM algorithm. Neural Computation, 1994.

[10]  K. Lange. A gradient algorithm locally equivalent to the EM algorithm. J. R. Statist. Soc. B, 1995.

[11]  G. J. McLachlan and T. Krishnan. The EM Algorithm and Extensions. Wiley, 1996.

[12]  M. Pelletier. Weak convergence rates for stochastic approximation with application to multiple targets and simulated annealing. Ann. Appl. Probab., 1998.

[13]  R. M. Neal and G. E. Hinton. A view of the EM algorithm that justifies incremental, sparse, and other variants. In Learning in Graphical Models, 1998.

[14]  M. Sato and S. Ishii. On-line EM algorithm for the normalized Gaussian network. Neural Computation, 2000.

[15]  H.-F. Chen. Stochastic Approximation and its Applications. Kluwer, 2002.

[16]  M. Hurn, A. Justel and C. P. Robert. Estimating mixtures of regressions. J. Comput. Graph. Statist., 2003.

[17]  H. J. Kushner and G. G. Yin. Stochastic Approximation and Recursive Algorithms and Applications. Springer, 2003.

[18]  F. Leisch. FlexMix: a general framework for finite mixture models and latent class regression in R. J. Statist. Softw., 2004.

[19]  C. Andrieu, É. Moulines and P. Priouret. Stability of stochastic approximation under verifiable conditions. In Proc. 44th IEEE Conference on Decision and Control (CDC), 2005.

[20]  P. J. Chung and J. F. Böhme. Recursive EM and SAGE-inspired algorithms with application to DOA estimation. IEEE Trans. Signal Process., 2005.

[21]  O. Cappé, M. Charbit and É. Moulines. Recursive EM algorithm with applications to DOA estimation. In Proc. IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2006.

[22]  A. Mokkadem and M. Pelletier. Convergence rate and averaging of nonlinear two-time-scale stochastic approximation algorithms. Ann. Appl. Probab., 2006. arXiv:math/0610329.

[23]  J. Almhana et al. Online EM algorithm for mixture with application to internet traffic modeling. Comput. Statist. Data Anal., 2004.

[24]  S. Wang and Y. Zhao. Almost sure convergence of Titterington's recursive estimator for mixture models. In Proc. IEEE International Symposium on Information Theory, 2002.

[25]  B. Grün and F. Leisch. Fitting finite mixtures of generalized linear regressions in R. Comput. Statist. Data Anal., 2007.