A provably convergent alternating minimization method for mean field inference

Mean field inference is an efficient way to approximate a posterior distribution in complex graphical models and constitutes the most popular class of Bayesian variational approximation methods. In most applications, the parameters of the mean field distribution are computed by alternating coordinate minimization. However, the convergence properties of this algorithm remain unclear. In this paper, we show that adding an appropriate penalization term guarantees convergence to a critical point while keeping a closed-form update at each step. A convergence rate estimate can also be derived from recent results in non-convex optimization.
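To make the idea concrete, here is a minimal sketch of a penalized coordinate update on a toy problem. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the model is a small Ising model with spins in {-1, +1}, the penalty is a Kullback-Leibler proximal term `lam * KL(q_i || q_i_old)`, and the names `free_energy`, `proximal_mean_field` and `lam` are hypothetical. Under these assumptions the penalized coordinate problem still has a closed-form solution, a damped version of the usual `tanh` update, and each update cannot increase the mean field free energy.

```python
import math


def free_energy(m, h, J):
    """Mean field free energy of an Ising model with spin means m.

    Assumes p(x) proportional to exp(sum_i h_i x_i + 1/2 sum_{i != j} J_ij x_i x_j)
    with J symmetric; the diagonal of J is ignored.
    """
    n = len(m)
    energy = -sum(h[i] * m[i] for i in range(n))
    energy -= 0.5 * sum(
        J[i][j] * m[i] * m[j] for i in range(n) for j in range(n) if i != j
    )
    entropy = 0.0
    for mi in m:
        for p in ((1.0 + mi) / 2.0, (1.0 - mi) / 2.0):
            if p > 0.0:
                entropy -= p * math.log(p)
    return energy - entropy


def proximal_mean_field(h, J, lam=1.0, sweeps=50):
    """Coordinate-wise mean field updates with a KL proximal penalty (assumed form).

    For coordinate i, minimizing free energy + lam * KL(q_i || q_i_old) gives
        m_i = tanh((field_i + lam * atanh(m_i_old)) / (1 + lam)),
    which reduces to the plain mean field update when lam = 0.
    """
    n = len(h)
    m = [0.0] * n
    history = [free_energy(m, h, J)]
    eps = 1e-12  # keeps atanh finite if a mean saturates numerically
    for _ in range(sweeps):
        for i in range(n):
            field = h[i] + sum(J[i][j] * m[j] for j in range(n) if j != i)
            old = max(-1.0 + eps, min(1.0 - eps, m[i]))
            m[i] = math.tanh((field + lam * math.atanh(old)) / (1.0 + lam))
        history.append(free_energy(m, h, J))
    return m, history


if __name__ == "__main__":
    h = [0.5, -0.3, 0.2]
    J = [[0.0, 1.0, -0.5], [1.0, 0.0, 0.8], [-0.5, 0.8, 0.0]]
    m, history = proximal_mean_field(h, J, lam=1.0, sweeps=50)
    print("means:", m)
    print("free energy, first and last:", history[0], history[-1])
```

Larger values of `lam` keep each update closer to the previous iterate, trading speed for stability. This sketch only illustrates that a penalty can leave the update in closed form; it does not reproduce the convergence analysis.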
