A Genetic Algorithm for Learning Parameters in Bayesian Networks using Expectation Maximization

Expectation maximization (EM) is a popular algorithm for parameter estimation in situations with incomplete data. Despite its popularity, EM often converges to local rather than global optima. Several techniques have been proposed to address this problem, for example initializing EM from multiple random starting points and selecting the run with the highest likelihood; unfortunately, this approach is computationally expensive. In this paper, our goal is to reduce computational cost while at the same time maximizing likelihood. We propose a Genetic Algorithm for Expectation Maximization (GAEM) for learning parameters in Bayesian networks. GAEM combines the global search property of a genetic algorithm with the local search property of EM. We prove GAEM's global convergence theoretically. Experimentally, we show that GAEM provides significant speed-ups, since it tends to select fitter individuals, which converge faster, as parents for the next generation. Specifically, GAEM converges 1.5 to 7 times faster while producing better log-likelihood scores than the traditional EM algorithm.
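
To make the GA-plus-EM idea concrete, below is a minimal sketch (not the paper's exact GAEM): each individual is a parameter vector, every generation runs a few EM steps on each individual as local search, and the fitter individuals by log-likelihood are selected as parents for crossover and mutation. For a self-contained, runnable example the parameters here are those of a two-component 1-D Gaussian mixture rather than Bayesian network CPTs, and the specific operators (truncation selection, arithmetic crossover, Gaussian mutation) are illustrative assumptions.

```python
# Illustrative GA + EM sketch; the mixture model and GA operators are assumptions,
# standing in for Bayesian network parameters and the paper's exact GAEM operators.
import numpy as np

rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

def log_likelihood(params):
    """Log-likelihood of a 2-component, unit-variance 1-D Gaussian mixture."""
    w, mu1, mu2 = params
    p = (w * np.exp(-0.5 * (data - mu1) ** 2) +
         (1 - w) * np.exp(-0.5 * (data - mu2) ** 2)) / np.sqrt(2 * np.pi)
    return np.sum(np.log(p + 1e-300))

def em_step(params):
    """One EM iteration: E-step responsibilities, M-step parameter updates."""
    w, mu1, mu2 = params
    p1 = w * np.exp(-0.5 * (data - mu1) ** 2)
    p2 = (1 - w) * np.exp(-0.5 * (data - mu2) ** 2)
    r = p1 / (p1 + p2 + 1e-300)                      # responsibilities for component 1
    return np.array([r.mean(),
                     np.sum(r * data) / (r.sum() + 1e-300),
                     np.sum((1 - r) * data) / ((1 - r).sum() + 1e-300)])

def random_individual():
    return np.array([rng.uniform(0.2, 0.8), rng.normal(0, 3), rng.normal(0, 3)])

# GA outer loop: EM as local search, then selection, crossover, and mutation.
population = [random_individual() for _ in range(8)]
for generation in range(20):
    for _ in range(3):                               # a few EM steps per individual
        population = [em_step(ind) for ind in population]
    fitness = np.array([log_likelihood(ind) for ind in population])
    parents = [population[i] for i in np.argsort(fitness)[-4:]]   # fitter half survives
    children = []
    while len(children) < 4:
        a, b = rng.choice(4, size=2, replace=False)
        child = 0.5 * (parents[a] + parents[b])      # arithmetic crossover
        child += rng.normal(0, 0.1, size=3)          # Gaussian mutation
        child[0] = np.clip(child[0], 0.05, 0.95)     # keep the mixture weight valid
        children.append(child)
    population = parents + children

best = max(population, key=log_likelihood)
print("best log-likelihood:", log_likelihood(best), "params:", best)
```

Running the EM inner loop before selection is what lets the genetic algorithm favor individuals that are already converging to high-likelihood regions, which is the intuition behind the speed-ups reported in the abstract.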
