Efficient Sampling for Gaussian Graphical Models via Spectral Sparsification

Motivated by a sampling problem basic to computational statistical inference, we develop a toolset based on spectral sparsification for a family of fundamental problems involving Gaussian sampling, matrix functionals, and reversible Markov chains. Drawing on the connection between Gaussian graphical models and recent breakthroughs in spectral graph theory, we give the first nearly linear time algorithm for the following basic matrix problem: given an n × n Laplacian matrix M and a constant −1 ≤ p ≤ 1, provide efficient access to a sparse n × n linear operator C such that M^p ≈ CC^⊤, where ≈ denotes spectral similarity. When p is set to −1, this gives the first parallel sampling algorithm that is essentially optimal both in total work and in randomness for Gaussian random fields with symmetric diagonally dominant (SDD) precision matrices: it requires only nearly linear work and 2n i.i.d. univariate Gaussian samples to generate an n-dimensional Gaussian random sample in polylogarithmic depth. The key ingredient of our approach is the integration of spectral sparsification with multilevel methods: our algorithms are based on factoring M into a product of well-conditioned matrices, then introducing powers and replacing dense matrices with sparse approximations. We give two sparsification methods for this approach that may be of independent interest. The first invokes Maclaurin series on the factors, while the second builds on our new nearly linear time spectral sparsification algorithm for random-walk matrix polynomials. We expect these algorithmic advances will also help strengthen the connection between machine learning and spectral graph theory, two of the most active fields in the study of large data and networks.
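
For intuition, the sketch below illustrates why such a factorization suffices for Gaussian sampling in the p = −1 case; it is a minimal, dense illustration on a hypothetical toy precision matrix, not the paper's nearly linear time construction. If C satisfies CC^⊤ ≈ M^{-1} and z is a vector of i.i.d. standard Gaussians, then x = Cz has covariance CC^⊤ ≈ M^{-1}, i.e. x is (approximately) a sample from the Gaussian with precision matrix M. The dense Cholesky factor used here merely stands in for the sparse operator C; the paper's contribution is constructing such an operator with nearly linear work and a nearly linear number of nonzeros.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy SDD precision matrix: Laplacian of a 5-node path plus a small diagonal
# shift (the shift makes M positive definite, hence a valid precision matrix).
n = 5
L = np.diag([1.0, 2.0, 2.0, 2.0, 1.0])
for i in range(n - 1):
    L[i, i + 1] = L[i + 1, i] = -1.0
M = L + 0.1 * np.eye(n)

# Dense stand-in for the sparse factor: C C^T = M^{-1} exactly here.
C = np.linalg.cholesky(np.linalg.inv(M))

# One n-dimensional sample with precision matrix M from n i.i.d. standard
# univariate Gaussians: x = C z has covariance C C^T = M^{-1}.
z = rng.standard_normal(n)
x = C @ z

# Sanity check: the empirical covariance of many such samples approaches M^{-1}.
Z = rng.standard_normal((n, 100_000))
X = C @ Z
emp_cov = X @ X.T / X.shape[1]
print(np.max(np.abs(emp_cov - np.linalg.inv(M))))  # small; shrinks with more samples
```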
