A comparison of algorithms for inference and learning in probabilistic graphical models

Research into methods for reasoning under uncertainty is currently one of the most exciting areas of artificial intelligence, largely because it has recently become possible to record, store, and process large amounts of data. While impressive achievements have been made in pattern classification problems such as handwritten character recognition, face detection, speaker identification, and prediction of gene function, it is even more exciting that researchers are on the verge of introducing systems that can perform large-scale combinatorial analyses of data, decomposing the data into interacting components. For example, computational methods for automatic scene analysis are now emerging in the computer vision community. These methods decompose an input image into its constituent objects, lighting conditions, motion patterns, and so on. Two of the main challenges are finding effective representations and models for specific applications and finding efficient algorithms for inference and learning in these models. In this paper, we advocate the use of graph-based probability models and their associated inference and learning algorithms. We review exact techniques and various approximate, computationally efficient techniques, including iterated conditional modes, the expectation-maximization (EM) algorithm, Gibbs sampling, the mean field method, variational techniques, structured variational techniques, and the sum-product algorithm ("loopy" belief propagation). We describe how each technique can be applied in a vision model of multiple, occluding objects and contrast the behavior and performance of the techniques using a unifying cost function, free energy.
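To make the idea of free energy as a unifying cost function concrete, the following is a minimal sketch, assuming the standard variational definition F(Q) = sum_h Q(h) log(Q(h) / P(h, v)), which is bounded below by -log P(v). The toy joint probabilities and the names `joint`, `free_energy`, `q_exact`, `q_point`, and `q_uniform` are hypothetical and chosen only for illustration; they do not come from the vision model discussed in the paper. The sketch compares the free energy of the exact posterior, of a point estimate of the kind used by iterated conditional modes, and of an arbitrary uniform distribution.

```python
import numpy as np

# Hypothetical toy values of P(h, v) for one observed v and a single
# hidden variable h with three states. They sum to P(v) = 0.40.
joint = np.array([0.10, 0.25, 0.05])

def free_energy(q, joint):
    """F(Q) = sum_h Q(h) * log(Q(h) / P(h, v)), treating 0 * log 0 as 0."""
    q = np.asarray(q, dtype=float)
    mask = q > 0  # skip states where Q puts no mass
    return float(np.sum(q[mask] * np.log(q[mask] / joint[mask])))

# Lower bound on the free energy: the negative log evidence, -log P(v).
neg_log_evidence = float(-np.log(joint.sum()))

# Exact inference: Q equals the true posterior P(h | v), so the bound is tight.
q_exact = joint / joint.sum()
f_exact = free_energy(q_exact, joint)

# Point estimate (in the spirit of iterated conditional modes):
# all probability mass on the single most probable hidden state.
q_point = np.zeros_like(joint)
q_point[np.argmax(joint)] = 1.0
f_point = free_energy(q_point, joint)

# An arbitrary approximating distribution, for comparison.
q_uniform = np.full(3, 1.0 / 3.0)
f_uniform = free_energy(q_uniform, joint)

print(f"-log P(v)        = {neg_log_evidence:.4f}")
print(f"F(exact)         = {f_exact:.4f}")
print(f"F(point estimate)= {f_point:.4f}")
print(f"F(uniform)       = {f_uniform:.4f}")
```

Under this definition, any approximation that restricts the form of Q can be scored on the same scale: the closer its free energy is to -log P(v), the closer Q is to the exact posterior.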
