Obtaining EM Initial Points by Using the Primitive Initial Point and Subsampling Strategy

The EM algorithm is an efficient method for obtaining the maximum likelihood (ML) estimate from incomplete data, but it suffers from the local optimality problem: the solution it converges to depends on the initial point. The deterministic annealing EM (DAEM) algorithm, which begins its search from the primitive initial point, was proposed to address this problem. The mes-EM algorithm was proposed later as a variant of the m-EM algorithm that begins a multiple-token EM search from the primitive initial point. The mes-EM algorithm obtains excellent solutions, but at a rather high computing cost. This paper proposes a lighter version of the mes-EM algorithm that uses a subsampling strategy, and evaluates its performance.
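The abstract does not spell out the procedure, so the following is only a minimal sketch of the general idea of using a subsample to obtain an EM initial point, not the mes-EM algorithm or its lighter version. It assumes a 1-D Gaussian mixture, and it assumes that the primitive initial point means every component set to the single-Gaussian fit of the data; because EM does not move when all components are exactly identical, the sketch perturbs the means around that point. The names `em_gmm_1d`, `subsample_init`, `frac`, `n_trials`, and `spread` are hypothetical, and no multiple-token search or annealing is implemented.

```python
import numpy as np

def log_likelihood(x, means, variances, weights):
    """Total log-likelihood of 1-D data x under a Gaussian mixture."""
    log_p = (np.log(weights)
             - 0.5 * np.log(2.0 * np.pi * variances)
             - 0.5 * (x[:, None] - means) ** 2 / variances)
    return np.logaddexp.reduce(log_p, axis=1).sum()

def em_gmm_1d(x, means, variances, weights, n_iter=500, tol=1e-9):
    """Plain EM for a 1-D Gaussian mixture, started from the given parameters."""
    means, variances, weights = (np.asarray(a, dtype=float).copy()
                                 for a in (means, variances, weights))
    prev_ll = -np.inf
    for _ in range(n_iter):
        # E-step: log weighted densities, shape (N, K), then responsibilities.
        log_p = (np.log(weights)
                 - 0.5 * np.log(2.0 * np.pi * variances)
                 - 0.5 * (x[:, None] - means) ** 2 / variances)
        log_norm = np.logaddexp.reduce(log_p, axis=1)
        ll = log_norm.sum()
        if ll - prev_ll < tol:  # stop once the likelihood stops improving
            break
        prev_ll = ll
        resp = np.exp(log_p - log_norm[:, None])
        # M-step: re-estimate weights, means, variances (small variance floor).
        nk = resp.sum(axis=0) + 1e-12
        weights = nk / nk.sum()
        means = resp.T @ x / nk
        variances = (resp * (x[:, None] - means) ** 2).sum(axis=0) / nk + 1e-6
    return means, variances, weights

def subsample_init(x, k, frac=0.1, n_trials=5, spread=0.5, seed=0):
    """Run cheap EM on random subsamples; keep the fit that scores best on all of x."""
    rng = np.random.default_rng(seed)
    m = min(len(x), max(5 * k, int(frac * len(x))))
    best = None
    for _ in range(n_trials):
        sub = rng.choice(x, size=m, replace=False)
        # Assumption: start near the single-Gaussian fit of the subsample,
        # with perturbed means so that EM can move away from it.
        means0 = sub.mean() + spread * sub.std() * rng.standard_normal(k)
        params = em_gmm_1d(sub, means0, np.full(k, sub.var()), np.full(k, 1.0 / k))
        ll = log_likelihood(x, *params)  # evaluate on the full data set
        if best is None or ll > best[0]:
            best = (ll, params)
    return best[1]

# Usage: synthetic two-cluster data, subsample-based start, then full-data EM.
rng = np.random.default_rng(1)
x = np.concatenate([rng.normal(-3.0, 1.0, 1000), rng.normal(3.0, 1.0, 1000)])
init = subsample_init(x, k=2)
means, variances, weights = em_gmm_1d(x, *init)
print(np.sort(means))
```

The design choice illustrated here is that the exploratory EM runs touch only a fraction `frac` of the data, so trying several starts costs far less than running each of them on the full data set; only the selected candidate is refined with full-data EM.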
