Simultaneous Clustering and Estimation of Heterogeneous Graphical Models

We consider joint estimation of multiple graphical models arising from heterogeneous and high-dimensional observations. Unlike most previous approaches, which assume that the cluster structure is given in advance, our method learns the cluster structure while estimating the heterogeneous graphical models. This is achieved via a high-dimensional version of the Expectation Conditional Maximization (ECM) algorithm (Meng and Rubin, 1993). A joint graphical lasso penalty is imposed in the conditional maximization step to extract both the homogeneous and the heterogeneous components across all clusters. Our algorithm is computationally efficient thanks to fast sparse learning routines and can be implemented without unsupervised learning knowledge. Extensive experiments demonstrate the superior performance of our method, and its application to a glioblastoma cancer dataset yields new insights into glioblastoma. On the theoretical side, we establish a non-asymptotic error bound for the output of our high-dimensional ECM algorithm itself; the bound consists of two quantities: statistical error (statistical accuracy) and optimization error (computational complexity). This result provides a theoretical guideline for terminating the ECM iterations.
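As a rough illustration of how a bound of this shape could guide termination, the sketch below assumes (this is not stated in the abstract) that the optimization error shrinks geometrically, by a factor `kappa` in (0, 1) per ECM iteration, so that after `t` iterations the total error is at most `stat_error + kappa**t * init_error`. Under that assumption, iterating past the point where the optimization term falls below the statistical term brings little further gain. The function names, `kappa`, `init_error`, and `stat_error` are all hypothetical.

```python
def ecm_error_bound(t, init_error, stat_error, kappa):
    """Assumed bound after t iterations: statistical error plus a
    geometrically decaying optimization error (illustrative model only)."""
    return stat_error + (kappa ** t) * init_error


def iterations_to_stop(init_error, stat_error, kappa):
    """Smallest t such that kappa**t * init_error <= stat_error,
    i.e. the optimization error no longer dominates the bound."""
    if not 0.0 < kappa < 1.0:
        raise ValueError("kappa must lie strictly between 0 and 1")
    if init_error <= 0.0 or stat_error <= 0.0:
        raise ValueError("error magnitudes must be positive")
    t = 0
    # Loop terminates because kappa**t -> 0 and stat_error > 0.
    while (kappa ** t) * init_error > stat_error:
        t += 1
    return t


if __name__ == "__main__":
    t_stop = iterations_to_stop(init_error=8.0, stat_error=1.0, kappa=0.5)
    print(t_stop, ecm_error_bound(t_stop, 8.0, 1.0, 0.5))
```

In practice neither the contraction factor nor the statistical error is known exactly, so a rule like this would serve only as an order-of-magnitude guide to the number of iterations.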

[1]  J. MacQueen Some methods for classification and analysis of multivariate observations , 1967 .

[2]  Charles R. Johnson,et al.  Matrix analysis , 1985 .

[3]  Xiao-Li Meng,et al.  Maximum likelihood estimation via the ECM algorithm: A general framework , 1993 .

[4]  Jiahua Chen Optimal Rate of Convergence for Finite Mixture Models , 1995 .

[5]  G. McLachlan,et al.  The EM algorithm and extensions , 1996 .

[6]  Jianqing Fan,et al.  Sure independence screening for ultrahigh dimensional feature space , 2006, math/0612857.

[7]  M. Yuan,et al.  Model selection and estimation in the Gaussian graphical model , 2007 .

[8]  Wei Pan,et al.  Penalized Model-Based Clustering with Application to Variable Selection , 2007, J. Mach. Learn. Res..

[9]  R. Tibshirani,et al.  Sparse inverse covariance estimation with the graphical lasso. , 2008, Biostatistics.

[10]  Dan Klein,et al.  Fully distributed EM for very large datasets , 2008, ICML '08.

[11]  Wen Zhang,et al.  How much can behavioral targeting help online advertising? , 2009, WWW '09.

[12]  Pei Wang,et al.  Partial Correlation Estimation by Joint Sparse Regression Models , 2008, Journal of the American Statistical Association.

[13]  Martin J. Wainwright,et al.  A unified framework for high-dimensional analysis of $M$-estimators with decomposable regularizers , 2009, NIPS.

[14]  Xiaotong Shen,et al.  Penalized model-based clustering with unconstrained covariance matrices. , 2009, Electronic journal of statistics.

[16]  J. Uhm Comprehensive genomic characterization defines human glioblastoma genes and core pathways , 2009 .

[17]  John F. Canny,et al.  Large-scale behavioral targeting , 2009, KDD.

[18]  Jianqing Fan,et al.  Network exploration via the adaptive LASSO and SCAD penalties. , 2009, The annals of applied statistics.

[19]  P. Bickel,et al.  Covariance regularization by thresholding , 2009, 0901.3079.

[20]  S. Gabriel,et al.  Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. , 2010, Cancer cell.

[21]  Ali Shojaie,et al.  Penalized Principal Component Regression on Graphs for Analysis of Subnetworks , 2010, NIPS.

[22]  Ali Shojaie,et al.  Penalized likelihood methods for estimation of sparse high-dimensional directed acyclic graphs. , 2009, Biometrika.

[23]  Maria-Florina Balcan,et al.  Robust hierarchical clustering , 2013, J. Mach. Learn. Res..

[24]  Junhui Wang Consistent selection of the number of clusters via crossvalidation , 2010 .

[25]  E. Levina,et al.  Joint estimation of multiple graphical models. , 2011, Biometrika.

[26]  Stephen P. Boyd,et al.  Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers , 2011, Found. Trends Mach. Learn..

[27]  T. Cai,et al.  A Constrained ℓ1 Minimization Approach to Sparse Precision Matrix Estimation , 2011, 1102.2233.

[28]  Roman Vershynin,et al.  Introduction to the non-asymptotic analysis of random matrices , 2010, Compressed Sensing.

[29]  Wei Sun,et al.  Regularized k-means clustering of high-dimensional data and its asymptotic consistency , 2012 .

[30]  I. Segal,et al.  What Makes Them Click: Empirical Analysis of Consumer Demand for Search Advertising , 2012 .

[31]  Xiaotong Shen,et al.  Likelihood-based Selection and Sharp Parameter Estimation , 2012, Journal of the American Statistical Association.

[32]  Harrison H. Zhou,et al.  Estimating Sparse Precision Matrix: Optimal Rates of Convergence and Adaptive Estimation , 2012, 1212.2882.

[33]  Tuo Zhao,et al.  Sparse Inverse Covariance Estimation with Calibration , 2013, NIPS.

[34]  X. Nguyen Convergence of latent mixing measures in finite and infinite mixture models , 2011, 1109.3250.

[35]  Junhui Wang,et al.  Joint estimation of sparse multivariate regression and conditional graphical models , 2013, ArXiv.

[36]  M. Wainwright Structured Regularizers for High-Dimensional Problems: Statistical and Computational Issues , 2014 .

[37]  Adam J. Rothman,et al.  On the existence of the weighted bridge penalized Gaussian likelihood precision matrix estimator , 2014 .

[38]  Martin J. Wainwright,et al.  Statistical guarantees for the EM algorithm: From population to sample-based analysis , 2014, ArXiv.

[39]  Xi Chen,et al.  Spectral Methods Meet EM: A Provably Optimal Algorithm for Crowdsourcing , 2014, J. Mach. Learn. Res..

[40]  Patrick Danaher,et al.  The joint graphical lasso for inverse covariance estimation across multiple classes , 2011, Journal of the Royal Statistical Society. Series B, Statistical methodology.

[41]  Xiaotong Shen,et al.  Structural Pursuit Over Multiple Undirected Graphs , 2014, Journal of the American Statistical Association.

[42]  Yufeng Liu,et al.  Joint estimation of multiple precision matrices with common structures , 2015, J. Mach. Learn. Res..

[43]  Nhat Ho,et al.  Identifiability and optimal rates of convergence for parameters of multiple types in finite mixtures , 2015, 1501.02497.

[44]  Jian Yang,et al.  Robust Tree-based Causal Inference for Complex Ad Effectiveness Analysis , 2015, WSDM.

[45]  Zhaoran Wang,et al.  High Dimensional EM Algorithm: Statistical Optimization and Asymptotic Normality , 2015, NIPS.

[46]  Christine B Peterson,et al.  Bayesian Inference of Multiple Gaussian Graphical Models , 2015, Journal of the American Statistical Association.

[47]  Constantine Caramanis,et al.  Regularized EM Algorithms: A Unified Framework and Statistical Guarantees , 2015, NIPS.

[48]  Alexis Boukouvalas,et al.  What to Do When K-Means Clustering Fails: A Simple yet Principled Alternative Algorithm , 2016, PloS one.

[49]  Han Liu,et al.  Joint estimation of multiple graphical models from high dimensional time series , 2013, Journal of the Royal Statistical Society. Series B, Statistical methodology.

[50]  Xiaotong Shen,et al.  Estimation of multiple networks in Gaussian mixture models. , 2016, Electronic journal of statistics.

[51]  Hongzhe Li,et al.  Joint Estimation of Multiple High-dimensional Precision Matrices. , 2016, Statistica Sinica.

[52]  Takumi Saegusa,et al.  Joint Estimation of Precision Matrices in Heterogeneous Populations. , 2016, Electronic journal of statistics.

[53]  Yong He,et al.  Joint estimation of multiple high‐dimensional Gaussian copula graphical models , 2017 .

[54]  Yucheng Dong,et al.  A Unified Framework , 2018, Linguistic Decision Making.