Estimating the "Wrong" Graphical Model: Benefits in the Computation-Limited Setting

Consider the problem of joint parameter estimation and prediction in a Markov random field: that is, the model parameters are estimated on the basis of an initial set of data, and then the fitted model is used to perform prediction (e.g., smoothing, denoising, interpolation) on a new noisy observation. Working under the restriction of limited computation, we analyze a joint method in which the same convex variational relaxation is used to construct an M-estimator for fitting parameters, and to perform approximate marginalization for the prediction step. The key result of this paper is that in the computation-limited setting, using an inconsistent parameter estimator (i.e., an estimator that returns the "wrong" model even in the infinite data limit) is provably beneficial, since the resulting errors can partially compensate for errors made by using an approximate prediction technique. En route to this result, we analyze the asymptotic properties of M-estimators based on convex variational relaxations, and establish a Lipschitz stability property that holds for a broad class of convex variational methods. This stability result provides additional incentive, apart from the obvious benefit of unique global optima, for using message-passing methods based on convex variational relaxations. We show that joint estimation/prediction based on the reweighted sum-product algorithm substantially outperforms a commonly used heuristic based on ordinary sum-product.
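Since the argument turns on using the same convex variational relaxation for both fitting and prediction, a concrete picture of the prediction step may help. Below is a minimal sketch of tree-reweighted (TRW) sum-product message passing for a pairwise binary Markov random field, written against plain numpy. The function name `trw_sum_product`, the data layout, and the toy 4-cycle example are illustrative assumptions, not code from the paper; setting every edge appearance probability to 1 recovers ordinary sum-product, the baseline heuristic referred to above.

```python
import numpy as np

def trw_sum_product(theta_node, theta_edge, edges, rho, n_iters=200, damp=0.5):
    """Tree-reweighted sum-product on a pairwise binary MRF (illustrative sketch).

    theta_node: (n, 2) array of node potentials theta_u(x_u).
    theta_edge: dict mapping (u, v) -> (2, 2) array of edge potentials theta_uv.
    edges:      list of undirected edges (u, v) with u < v.
    rho:        dict mapping (u, v) -> edge appearance probability in (0, 1].
                rho = 1 on every edge recovers ordinary sum-product.
    Returns approximate (pseudo-)marginals tau of shape (n, 2).
    """
    n = theta_node.shape[0]
    # Directed messages M_{u->v}(x_v), initialized uniform.
    msgs = {}
    for (u, v) in edges:
        msgs[(u, v)] = np.ones(2) / 2
        msgs[(v, u)] = np.ones(2) / 2
    nbrs = {u: [] for u in range(n)}
    for (u, v) in edges:
        nbrs[u].append(v)
        nbrs[v].append(u)

    def edge_rho(u, v):
        return rho[(u, v)] if (u, v) in rho else rho[(v, u)]

    def edge_pot(u, v):
        # Return the (2, 2) potential indexed as [x_u, x_v].
        if (u, v) in theta_edge:
            return theta_edge[(u, v)]
        return theta_edge[(v, u)].T

    for _ in range(n_iters):
        new_msgs = {}
        for (u, v) in msgs:
            r = edge_rho(u, v)
            # Log pre-message at u: node potential, reweighted incoming
            # messages, and a division by the backward message M_{v->u}.
            log_pre = theta_node[u].copy()
            for w in nbrs[u]:
                if w == v:
                    log_pre -= (1 - r) * np.log(msgs[(v, u)])
                else:
                    log_pre += edge_rho(w, u) * np.log(msgs[(w, u)])
            # Reweighted edge potential theta_uv / rho_uv.
            a = edge_pot(u, v) / r + log_pre[:, None]
            m = np.exp(a - a.max()).sum(axis=0)  # marginalize over x_u
            m /= m.sum()
            # Damped update for better convergence behavior.
            new_msgs[(u, v)] = damp * msgs[(u, v)] + (1 - damp) * m
        msgs = new_msgs

    # Pseudo-marginals: node potential plus reweighted incoming messages.
    tau = np.zeros((n, 2))
    for u in range(n):
        log_tau = theta_node[u].copy()
        for w in nbrs[u]:
            log_tau += edge_rho(w, u) * np.log(msgs[(w, u)])
        tau[u] = np.exp(log_tau - log_tau.max())
        tau[u] /= tau[u].sum()
    return tau

# Toy example: a 4-cycle with attractive couplings. Each edge of the
# 4-cycle appears in 3 of its 4 spanning trees, so rho = 3/4 is a valid
# uniform choice of edge appearance probabilities.
edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
rng = np.random.default_rng(0)
theta_node = rng.normal(scale=0.5, size=(4, 2))
theta_edge = {e: 0.8 * np.eye(2) for e in edges}
rho = {e: 0.75 for e in edges}
print(np.round(trw_sum_product(theta_node, theta_edge, edges, rho), 3))
```

With edge appearance probabilities strictly below 1 the underlying variational problem is convex, which is the source of the unique global optima and the Lipschitz stability property discussed above.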
