Probabilistic modeling and inference with the Belief Propagation algorithm

We study the construction and estimation, from incomplete observations, of models of real-valued random variables on a graph. These models must suit a non-standard regression problem in which the identity of the observed variables (and hence of the variables to predict) changes from one instance to the next. The nature of the problem and of the available data leads us to model the network as a Markov random field, a choice justified by Jaynes's maximum entropy principle. The prediction tool chosen in this work is the Belief Propagation algorithm, in its classical or Gaussian version, whose simplicity and efficiency allow it to be used on large networks. After giving a new result on the local stability of the algorithm's fixed points, we study an approach based on a latent Ising model in which the dependencies between real-valued variables are encoded through a network of binary variables. To this end, we propose a definition of these binary variables based on the cumulative distribution functions of the associated real-valued variables. For the prediction step, the Belief Propagation algorithm must be modified to impose Bayesian-type constraints on the marginal distributions of the binary variables. The model parameters can easily be estimated from pairwise observations. This approach is in fact a way of solving the regression problem by working on quantiles. In addition, we propose a greedy algorithm for estimating the structure and parameters of a Gaussian Markov random field, based on the Iterative Proportional Scaling algorithm. At each iteration, this algorithm produces a new model whose likelihood, or an approximation of it in the case of incomplete observations, is higher than that of the previous model. Because the algorithm works by local perturbation, spectral constraints can be imposed that ensure better compatibility of the resulting models with the Gaussian version of Belief Propagation. The performance of the different approaches is illustrated by numerical experiments on synthetic data.
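
As an illustration of the Gaussian version of Belief Propagation mentioned above, the following is a minimal sketch of the textbook message-passing updates for a Gaussian Markov random field with density proportional to exp(-x'Ax/2 + b'x). It is not taken from the thesis: the function name `gaussian_bp`, the synchronous update schedule, and the small chain-shaped precision matrix are all assumptions chosen for illustration. On a tree such as this chain, the computed means and variances should agree with the exact ones.

```python
import numpy as np

def gaussian_bp(A, b, n_iter=50):
    """Hypothetical helper: synchronous Gaussian BP for precision matrix A, potential b."""
    n = len(b)
    P = np.zeros((n, n))  # P[i, j]: precision carried by message i -> j
    h = np.zeros((n, n))  # h[i, j]: information (potential) carried by message i -> j
    # Neighbours of i are the off-diagonal non-zeros of row i of A.
    nbrs = [[k for k in range(n) if k != i and A[i, k] != 0] for i in range(n)]
    for _ in range(n_iter):
        P_new, h_new = np.zeros_like(P), np.zeros_like(h)
        for i in range(n):
            for j in nbrs[i]:
                # Cavity quantities: everything node i knows except what j told it.
                P_cav = A[i, i] + sum(P[k, i] for k in nbrs[i] if k != j)
                h_cav = b[i] + sum(h[k, i] for k in nbrs[i] if k != j)
                P_new[i, j] = -A[i, j] ** 2 / P_cav
                h_new[i, j] = -A[i, j] * h_cav / P_cav
        P, h = P_new, h_new
    # Beliefs: combine the local term with all incoming messages.
    prec = np.array([A[i, i] + sum(P[k, i] for k in nbrs[i]) for i in range(n)])
    info = np.array([b[i] + sum(h[k, i] for k in nbrs[i]) for i in range(n)])
    return info / prec, 1.0 / prec  # marginal means and variances

# Illustrative 4-node chain (a tree), values chosen arbitrarily.
A = np.array([[2.0, -0.5, 0.0, 0.0],
              [-0.5, 2.0, -0.5, 0.0],
              [0.0, -0.5, 2.0, -0.5],
              [0.0, 0.0, -0.5, 2.0]])
b = np.array([1.0, 0.0, -1.0, 2.0])
mean, var = gaussian_bp(A, b)
```

On graphs with cycles, the same updates are only guaranteed to behave well under conditions on the precision matrix, which is presumably why spectral constraints on the estimated models matter for compatibility with this algorithm.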
