Fused Multiple Graphical Lasso

In this paper, we consider the problem of estimating multiple graphical models simultaneously using the fused lasso penalty, which encourages adjacent graphs to share similar structures. A motivating example is the analysis of brain networks in Alzheimer's disease using neuroimaging data. Specifically, we may wish to estimate one brain network for normal controls (NC), one for patients with mild cognitive impairment (MCI), and one for patients with Alzheimer's disease (AD). We expect the NC and MCI networks to share common structures without being identical, and similarly for the MCI and AD networks. The proposed formulation can be solved using a second-order method. Our key technical contribution is to establish the necessary and sufficient condition for the graphs to be decomposable. Based on this key property, we present a simple screening rule that decomposes the large graphs into small subgraphs and allows efficient estimation of multiple independent (small) subgraphs, dramatically reducing the computational cost. We perform experiments on both synthetic and real data; the results demonstrate the effectiveness and efficiency of the proposed approach.
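For concreteness, the formulation described above can be sketched as follows. This is the standard fused multiple graphical lasso objective over K ordered classes (e.g., NC, MCI, AD); the notation is ours: Θ^(k) is the precision matrix of class k, S^(k) its empirical covariance, λ1 controls sparsity, and λ2 controls the fusion of adjacent graphs.

```latex
\min_{\Theta^{(1)},\ldots,\Theta^{(K)} \succ 0}\;
\sum_{k=1}^{K} \Big( \operatorname{tr}\big(S^{(k)}\Theta^{(k)}\big) - \log\det\Theta^{(k)} \Big)
+ \lambda_1 \sum_{k=1}^{K} \sum_{i \neq j} \big|\Theta^{(k)}_{ij}\big|
+ \lambda_2 \sum_{k=1}^{K-1} \sum_{i \neq j} \big|\Theta^{(k+1)}_{ij} - \Theta^{(k)}_{ij}\big|
```

The first term is the sum of Gaussian negative log-likelihoods, the λ1 term promotes sparse individual graphs, and the λ2 term penalizes differences between consecutive graphs, encouraging adjacent classes to share edges.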
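A minimal sketch of how such a screening rule enables decomposition, assuming for illustration the simple single-graph thresholding test |S_ij| > λ1 taken as a union across classes. The paper's exact necessary-and-sufficient condition is sharper and also involves λ2; the function name and this simplified test are ours, for illustration only.

```python
import numpy as np
from scipy.sparse import csr_matrix
from scipy.sparse.csgraph import connected_components

def screen_and_decompose(S_list, lam1):
    """Illustrative screening step: build the union graph of off-diagonal
    entries that survive thresholding in at least one class, then split
    the variables into connected components that can be estimated as
    independent (small) fused graphical lasso problems.

    S_list : list of (p, p) empirical covariance matrices, one per class.
    lam1   : sparsity parameter lambda_1.

    NOTE: |S_ij| > lam1 is the single-graph thresholding condition, used
    here as a stand-in for the paper's exact test, which also depends on
    the fusion parameter lambda_2.
    """
    p = S_list[0].shape[0]
    adj = np.zeros((p, p), dtype=bool)
    for S in S_list:
        off = np.abs(S) > lam1          # edges surviving the threshold
        np.fill_diagonal(off, False)     # ignore diagonal entries
        adj |= off                       # union across the K classes
    # Connected components of the union graph give the independent blocks.
    n_comp, labels = connected_components(csr_matrix(adj.astype(np.int8)),
                                          directed=False)
    return [np.flatnonzero(labels == c) for c in range(n_comp)]

# Example usage with K = 3 classes of p = 6 variables:
rng = np.random.default_rng(0)
S_list = [np.cov(rng.standard_normal((40, 6)), rowvar=False)
          for _ in range(3)]
print(screen_and_decompose(S_list, lam1=0.5))
```

Each returned block of variable indices b corresponds to submatrices [S[np.ix_(b, b)] for S in S_list] that can be solved independently, which is what dramatically reduces the computational cost.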
