On Model Selection Consistency of Lasso for High-Dimensional Ising Models on Tree-like Graphs

We consider the problem of high-dimensional Ising model selection using the neighborhood-based least absolute shrinkage and selection operator (Lasso). We rigorously prove that, under mild coherence conditions on the population covariance matrix of the Ising model, consistent model selection can be achieved with sample size n = Ω(d^3 log p) for any tree-like graph in the paramagnetic phase, where p is the number of variables and d is the maximum node degree. When the same conditions are imposed directly on the sample covariance matrices, a reduced sample size of n = Ω(d^2 log p) suffices. These sufficient conditions for consistent model selection with Lasso match the sample-complexity scaling of ℓ1-regularized logistic regression. Given the popularity and efficiency of Lasso, our analysis provides theoretical backing for its practical use in Ising model selection.
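To make the neighborhood-based procedure concrete, the following is a minimal sketch rather than the authors' implementation. It assumes a zero-field Ising chain (a simple tree, always paramagnetic) so that samples can be drawn exactly, regresses each spin on all the others with Lasso, and keeps an edge when both endpoints select each other. The function names, the coordinate-descent solver, the AND rule, and the values of `coupling`, `n`, and `lam` are illustrative assumptions, not taken from the paper.

```python
import math
import random


def sample_chain_ising(p, n, coupling, rng):
    """Draw n exact samples from a zero-field Ising chain with p spins."""
    # On a chain, a spin copies its predecessor with probability e^J / (2 cosh J).
    agree = math.exp(coupling) / (2.0 * math.cosh(coupling))
    samples = []
    for _ in range(n):
        spins = [rng.choice((-1, 1))]
        for _ in range(p - 1):
            spins.append(spins[-1] if rng.random() < agree else -spins[-1])
        samples.append(spins)
    return samples


def second_moments(samples):
    """Sample second-moment matrix; equals the covariance when means are zero."""
    n, p = len(samples), len(samples[0])
    return [[sum(s[i] * s[j] for s in samples) / n for j in range(p)]
            for i in range(p)]


def soft_threshold(z, lam):
    """Proximal operator of lam * |.|, which produces exact zeros."""
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0


def lasso_neighborhood(C, i, lam, sweeps=200):
    """Lasso regression of spin i on the other spins via coordinate descent.

    Minimizes (1/2n) * ||s_i - sum_j w_j s_j||^2 + lam * ||w||_1 using only
    the second-moment matrix C, and returns the indices with nonzero weight.
    """
    p = len(C)
    w = [0.0] * p
    for _ in range(sweeps):
        for j in range(p):
            if j == i:
                continue
            rho = C[i][j] - sum(C[j][k] * w[k]
                                for k in range(p) if k != i and k != j)
            w[j] = soft_threshold(rho, lam) / C[j][j]
    return {j for j in range(p) if w[j] != 0.0}


def select_edges(samples, lam):
    """Estimate the edge set, keeping (i, j) only if each node selects the other."""
    C = second_moments(samples)
    p = len(C)
    nbrs = [lasso_neighborhood(C, i, lam) for i in range(p)]
    return {(i, j) for i in range(p) for j in nbrs[i] if i < j and i in nbrs[j]}


rng = random.Random(0)
samples = sample_chain_ising(p=6, n=4000, coupling=0.4, rng=rng)
edges = select_edges(samples, lam=0.15)
print(sorted(edges))
```

In this toy setting the regularization strength is fixed by hand; in the high-dimensional regime the abstract refers to, its scaling with n and p is what the analysis has to control.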
