A Unified Framework for Structured Graph Learning via Spectral Constraints

Graph learning from data is a canonical problem that has received substantial attention in the literature. However, little work has addressed how to incorporate prior structural knowledge into the learning of underlying graphical models from data. Learning a graph with a specific structure is essential for interpretability and for identifying the relationships among data. Useful structured graphs include the multi-component graph, bipartite graph, connected graph, sparse graph, and regular graph. In general, structured graph learning is an NP-hard combinatorial problem; therefore, designing a general tractable optimization method is extremely challenging. In this paper, we introduce a unified graph learning framework that integrates Gaussian graphical models and spectral graph theory. To impose a particular structure on a graph, we first show how to formulate the combinatorial constraints as an analytical property of the graph matrix. We then develop an optimization framework that enables graph learning with specific structures via spectral constraints on graph matrices. The proposed algorithms are provably convergent, computationally efficient, and practically amenable to numerous graph-based tasks. Extensive numerical experiments with both synthetic and real data sets illustrate the effectiveness of the proposed algorithms. The code for all the simulations is made available as an open source repository.
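To illustrate how a combinatorial property can be restated as an analytical property of a graph matrix, consider a standard fact from spectral graph theory: for an undirected graph with positive edge weights, the number of connected components equals the multiplicity of the zero eigenvalue of its Laplacian. The sketch below is only an illustration of that fact, not the algorithm proposed in the paper. The function names are hypothetical, and the multiplicity is computed as the nullity of the Laplacian by exact Gaussian elimination rather than by an eigendecomposition.

```python
from fractions import Fraction


def laplacian(num_nodes, weighted_edges):
    """Build the combinatorial Laplacian L = D - W of an undirected graph.

    weighted_edges is a list of (i, j, w) tuples; weights are assumed positive.
    """
    L = [[Fraction(0)] * num_nodes for _ in range(num_nodes)]
    for i, j, w in weighted_edges:
        w = Fraction(w)
        L[i][j] -= w
        L[j][i] -= w
        L[i][i] += w
        L[j][j] += w
    return L


def rank(matrix):
    """Exact rank via Gaussian elimination over the rationals."""
    rows = [row[:] for row in matrix]
    n_rows = len(rows)
    n_cols = len(rows[0]) if rows else 0
    r = 0  # number of pivots found so far
    for col in range(n_cols):
        pivot = next((k for k in range(r, n_rows) if rows[k][col] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for k in range(r + 1, n_rows):
            if rows[k][col] != 0:
                factor = rows[k][col] / rows[r][col]
                rows[k] = [a - factor * b for a, b in zip(rows[k], rows[r])]
        r += 1
    return r


def zero_eigenvalue_multiplicity(num_nodes, weighted_edges):
    """Nullity of the Laplacian, i.e. the multiplicity of eigenvalue 0.

    Because L is symmetric, the nullity equals the algebraic multiplicity.
    """
    return num_nodes - rank(laplacian(num_nodes, weighted_edges))


# Two components: {0, 1, 2} and {3, 4}.
two_parts = [(0, 1, 2), (1, 2, 1), (3, 4, 3)]
# Adding the edge (2, 3) joins them into one connected graph.
connected = two_parts + [(2, 3, 1)]

print(zero_eigenvalue_multiplicity(5, two_parts))
print(zero_eigenvalue_multiplicity(5, connected))
```

Under this view, requiring a graph to have k components amounts to requiring exactly k zero Laplacian eigenvalues, which is the kind of spectral constraint the abstract refers to.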
