High-dimensional Ising model selection with Bayesian information criteria

We consider the use of Bayesian information criteria for selecting the graph underlying an Ising model. In an Ising model, the full conditional distribution of each variable forms a logistic regression model, so variable selection techniques for regression allow one to identify the neighborhood of each node and, thus, the entire graph. We prove high-dimensional consistency results for this pseudo-likelihood approach to graph selection when Bayesian information criteria are used for the variable selection problems in the logistic regressions. The results pertain to sparse scenarios and, following related prior work, the information criteria we consider incorporate an explicit prior that encourages sparsity.
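To make the neighborhood-selection idea concrete, the following is a minimal sketch rather than the procedure analyzed in the paper. It assumes spins coded as ±1, a penalty of the form −2·loglik + |J|(log n + 2γ log p) (one common form of the extended BIC, with p taken here to be the number of candidate neighbors), an exhaustive search over neighborhoods of bounded size, and an "and"/"or" rule for combining the per-node neighborhoods. All function names (`sample_ising`, `logistic_loglik`, `ebic`, `select_neighborhood`, `select_graph`) are hypothetical helpers introduced only for illustration.

```python
import itertools
import numpy as np

def sample_ising(p, theta, n, seed=0):
    """Draw n exact samples from a small Ising model by enumerating all 2^p states.
    theta maps edges (j, k) to interaction strengths; spins take values -1/+1."""
    rng = np.random.default_rng(seed)
    states = np.array(list(itertools.product([-1, 1], repeat=p)))
    energy = sum(t * states[:, j] * states[:, k] for (j, k), t in theta.items())
    probs = np.exp(energy)
    probs = probs / probs.sum()
    return states[rng.choice(len(states), size=n, p=probs)]

def logistic_loglik(X, y, n_iter=50, ridge=1e-8):
    """Maximized log-likelihood of a logistic regression with intercept (Newton's method).
    The tiny ridge term only stabilizes the linear solve; it is an implementation choice."""
    Z = np.hstack([np.ones((X.shape[0], 1)), X.astype(float)])
    beta = np.zeros(Z.shape[1])
    for _ in range(n_iter):
        eta = np.clip(Z @ beta, -30, 30)
        mu = 1.0 / (1.0 + np.exp(-eta))
        hess = Z.T @ (Z * (mu * (1 - mu))[:, None]) + ridge * np.eye(Z.shape[1])
        step = np.linalg.solve(hess, Z.T @ (y - mu))
        beta += step
        if np.max(np.abs(step)) < 1e-8:
            break
    eta = Z @ beta
    return float(np.sum(y * eta - np.logaddexp(0.0, eta)))

def ebic(loglik, k, n, p, gamma):
    """Assumed criterion: -2*loglik + k*(log n + 2*gamma*log p); gamma = 0 gives plain BIC."""
    return -2.0 * loglik + k * (np.log(n) + 2.0 * gamma * np.log(p))

def select_neighborhood(X, node, max_size, gamma):
    """Regress one node on every subset of the others (up to max_size); keep the best score."""
    n, p = X.shape
    others = [k for k in range(p) if k != node]
    y = (X[:, node] + 1) / 2.0  # recode -1/+1 as 0/1 for the logistic response
    best_score, best_set = np.inf, ()
    for size in range(max_size + 1):
        for subset in itertools.combinations(others, size):
            cols = np.array(subset, dtype=int)
            score = ebic(logistic_loglik(X[:, cols], y), size, n, len(others), gamma)
            if score < best_score:
                best_score, best_set = score, subset
    return set(best_set)

def select_graph(X, max_size, gamma, rule="and"):
    """Combine per-node neighborhoods into an edge set using an 'and' or 'or' rule."""
    p = X.shape[1]
    nbrs = [select_neighborhood(X, j, max_size, gamma) for j in range(p)]
    edges = set()
    for j in range(p):
        for k in range(j + 1, p):
            in_j, in_k = k in nbrs[j], j in nbrs[k]
            if (in_j and in_k) if rule == "and" else (in_j or in_k):
                edges.add((j, k))
    return edges

if __name__ == "__main__":
    chain = {(0, 1): 0.5, (1, 2): 0.5, (2, 3): 0.5}  # illustrative 4-node chain
    X = sample_ising(4, chain, n=5000, seed=0)
    print(sorted(select_graph(X, max_size=3, gamma=1.0)))
```

The exhaustive subset search is only feasible for very small graphs; it is used here to keep the sketch short, and any practical implementation would replace it with a search restricted to a candidate set of models.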
