Selection and estimation for mixed graphical models.

We consider the problem of estimating the parameters in a pairwise graphical model in which the distribution of each node, conditioned on the others, may have a different exponential family form. We identify restrictions on the parameter space required for the existence of a well-defined joint density, and establish the consistency of the neighbourhood selection approach for graph reconstruction in high dimensions when the true underlying graph is sparse. Motivated by our theoretical results, we investigate the selection of edges between nodes whose conditional distributions take different parametric forms, and show that efficiency can be gained if edge estimates obtained from the regressions of particular nodes are used to reconstruct the graph. These results are illustrated with examples of Gaussian, Bernoulli, Poisson and exponential distributions. Our theoretical findings are corroborated by evidence from simulation studies.
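The neighbourhood selection idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`l1_glm`, `neighbourhood_select`, `simulate`), the penalty level `lam`, the proximal-gradient solver, the "and"/"or" combination rule and the toy data are all assumptions made for illustration, and only Gaussian and Bernoulli nodes are covered. Each node is regressed on all other nodes with an l1-penalised generalized linear model matching that node's conditional family, and an edge is declared from the non-zero coefficients.

```python
import math
import random


def mean_fn(family, eta):
    """Conditional mean implied by the natural parameter eta."""
    if family == "gaussian":
        return eta
    if family == "bernoulli":
        eta = max(-30.0, min(30.0, eta))  # clip to avoid overflow in exp
        return 1.0 / (1.0 + math.exp(-eta))
    raise ValueError("unsupported family: " + family)


def l1_glm(X, y, family, lam, n_iter=300):
    """l1-penalised GLM fitted by proximal gradient; intercept unpenalised."""
    n, p = len(X), len(X[0])
    # The trace of the Gram matrix of [1, X] bounds the gradient's
    # Lipschitz constant for both families, so this step size is safe.
    step = 1.0 / (1.0 + sum(sum(r[j] ** 2 for r in X) / n for j in range(p)))
    b0, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        resid = []
        for row, yi in zip(X, y):
            eta = b0 + sum(bj * xj for bj, xj in zip(beta, row))
            resid.append(mean_fn(family, eta) - yi)
        b0 -= step * sum(resid) / n
        for j in range(p):
            grad = sum(r * row[j] for r, row in zip(resid, X)) / n
            z = beta[j] - step * grad
            # Soft-thresholding: the proximal operator of the l1 penalty.
            beta[j] = math.copysign(max(abs(z) - step * lam, 0.0), z)
    return beta


def neighbourhood_select(data, families, lam, rule="or"):
    """Regress every node on the rest; combine supports into a graph."""
    p = len(families)
    coef = [[0.0] * p for _ in range(p)]
    for s in range(p):
        others = [t for t in range(p) if t != s]
        X = [[row[t] for t in others] for row in data]
        y = [row[s] for row in data]
        for t, b in zip(others, l1_glm(X, y, families[s], lam)):
            coef[s][t] = b  # estimate of edge (s, t) from node s's regression
    adj = [[0] * p for _ in range(p)]
    for s in range(p):
        for t in range(s + 1, p):
            a, b = coef[s][t] != 0.0, coef[t][s] != 0.0
            keep = (a and b) if rule == "and" else (a or b)
            adj[s][t] = adj[t][s] = int(keep)
    return adj, coef


def simulate(n, seed=0):
    """Toy data: binary x0, Gaussian x1 depending on x0, independent x2."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x0 = 1.0 if rng.random() < 0.5 else 0.0
        rows.append([x0, rng.gauss(1.5 * x0, 1.0), rng.gauss(0.0, 1.0)])
    return rows


if __name__ == "__main__":
    families = ["bernoulli", "gaussian", "gaussian"]
    adj, coef = neighbourhood_select(simulate(300), families, lam=0.1)
    for row in adj:
        print(row)
```

For an edge between nodes of different types, `coef[s][t]` and `coef[t][s]` come from regressions with different likelihoods. The sketch keeps both so that a rule relying on one particular node's regression, as the abstract suggests, could replace the symmetric "and"/"or" rule used here.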
