Embedded trees: estimation of Gaussian Processes on graphs with cycles

Graphical models provide a powerful general framework for encoding the structure of large-scale estimation problems. However, the graphs describing typical real-world phenomena contain many cycles, making direct estimation procedures prohibitively costly. In this paper, we develop an iterative inference algorithm for general Gaussian graphical models. It operates by exactly solving a series of modified estimation problems on spanning trees embedded within the original cyclic graph. When these subproblems are suitably chosen, the algorithm converges to the correct conditional means. Moreover, and in contrast to many other iterative methods, the tree-based procedures we propose can also be used to calculate exact error variances. Although the conditional mean iteration is effective for quite densely connected graphical models, the error variance computation is most efficient for sparser graphs. In this context, we present a modeling example suggesting that very sparsely connected graphs with cycles may provide significant advantages relative to their tree-structured counterparts, thanks both to the expressive power of these models and to the efficient inference algorithms developed herein. The convergence properties of the proposed tree-based iterations are characterized both analytically and experimentally. In addition, by using the basic tree-based iteration to precondition the conjugate gradient method, we develop an alternative, accelerated iteration that is finitely convergent. Simulation results are presented that demonstrate this algorithm's effectiveness on several inference problems, including a prototype distributed sensing application.
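As a rough illustration of the tree-based iteration described above, the following minimal sketch assumes the model is given in information form, so that the conditional mean solves J x = h for a sparse inverse covariance matrix J. The function name `embedded_tree_iteration`, the 4-node single-cycle example, and the choice to cut edges by zeroing off-diagonal entries are all illustrative assumptions, not the paper's implementation. A dense solver stands in for the fast tree solver a practical implementation would use, and only a single fixed spanning tree is used here.

```python
import numpy as np

def embedded_tree_iteration(J, h, cut_edges, num_iters=50):
    """Approximate the solution of J x = h by repeated solves on one embedded tree.

    Hypothetical helper: cut_edges lists the edges removed from the cyclic
    graph so that the remaining graph is a spanning tree.
    """
    # Tree-structured matrix: copy J and zero the entries of the cut edges.
    J_tree = J.copy()
    for i, j in cut_edges:
        J_tree[i, j] = 0.0
        J_tree[j, i] = 0.0
    # Splitting J = J_tree - K, so K holds the removed edge terms.
    K = J_tree - J
    x = np.zeros_like(h)
    for _ in range(num_iters):
        # Each step is an exact solve on the tree (a dense solve stands in
        # for an efficient tree-based solver in this sketch).
        x = np.linalg.solve(J_tree, K @ x + h)
    return x

# Assumed toy model: a single 4-node cycle with a diagonally dominant J.
J = 2.0 * np.eye(4)
for i, j in [(0, 1), (1, 2), (2, 3), (3, 0)]:
    J[i, j] = J[j, i] = -0.5
h = np.array([1.0, 0.0, -1.0, 2.0])

x_et = embedded_tree_iteration(J, h, cut_edges=[(3, 0)])
x_direct = np.linalg.solve(J, h)
print(np.max(np.abs(x_et - x_direct)))  # gap between the iterate and the direct solve
```

In this toy case the matrix is strongly diagonally dominant, so the fixed-tree iteration contracts quickly; convergence in general depends on how the embedded trees are chosen, as the abstract notes.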
