Group Sparse Priors for Covariance Estimation

Recently it has become popular to learn sparse Gaussian graphical models (GGMs) by imposing l1 or group l1,2 penalties on the elements of the precision matrix. This penalized likelihood approach results in a tractable convex optimization problem. In this paper, we reinterpret these results as performing MAP estimation under novel priors, which we call the group l1 and l1,2 positive-definite matrix distributions. This enables us to build a hierarchical model in which the l1 regularization terms vary depending on which group the entries are assigned to, which in turn allows us to learn block-structured sparse GGMs with unknown group assignments. Exact inference in this hierarchical model is intractable because it requires computing the normalization constants of these matrix distributions. However, we derive upper bounds on the partition functions, which let us use fast variational inference (optimizing a lower bound on the joint posterior). We show that on two real-world data sets (motion capture and financial data), our method, which infers the block structure, outperforms a method that uses a fixed block structure, which in turn outperforms baseline methods that ignore block structure.
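To make the penalized likelihood objective concrete, the following is a minimal sketch of a group l1,2 penalized negative log-likelihood for a Gaussian precision matrix with a fixed, known group assignment. It is an illustration under stated assumptions, not the paper's implementation: the function name `group_l12_objective`, the single shared weight `lam`, and the choice to leave diagonal entries unpenalized are all hypothetical choices made here for clarity.

```python
import numpy as np

def group_l12_objective(K, S, groups, lam):
    """Group l1,2 penalized negative Gaussian log-likelihood (sketch).

    K      : candidate precision matrix (p x p, symmetric)
    S      : empirical covariance matrix (p x p)
    groups : length-p sequence of group labels, one per variable
    lam    : single regularization weight (assumption: shared by all blocks)

    Returns -log det K + tr(S K) + lam * sum of block Frobenius norms,
    or +inf if K is not positive definite.
    """
    try:
        # Cholesky succeeds only for positive-definite K.
        chol = np.linalg.cholesky(K)
    except np.linalg.LinAlgError:
        return np.inf

    logdet = 2.0 * np.sum(np.log(np.diag(chol)))
    nll = -logdet + np.trace(S @ K)

    groups = np.asarray(groups)
    labels = np.unique(groups)
    penalty = 0.0
    for i, a in enumerate(labels):
        for b in labels[i:]:
            block = K[np.ix_(groups == a, groups == b)]
            if a == b:
                # Assumption: diagonal entries of K are not penalized.
                block = block - np.diag(np.diag(block))
            # The l2 (Frobenius) norm within a block, summed (l1) across
            # blocks, encourages whole blocks to be zero together.
            penalty += np.linalg.norm(block)
    return nll + lam * penalty
```

With an identity precision matrix and identity covariance, every off-diagonal block is zero, so the penalty vanishes and the objective reduces to the trace term. Letting `lam` vary per block, rather than using one shared value as assumed here, would correspond to the group-dependent regularization described in the abstract.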