Generalized Direct Change Estimation in Ising Model Structure

We consider the problem of estimating the change in dependency structure between two $p$-dimensional Ising models, given $n_1$ and $n_2$ samples drawn from the two models, respectively. The change is assumed to be structured, e.g., sparse, block sparse, or node-perturbed sparse, so that it can be characterized by a suitable (atomic) norm. We present and analyze a norm-regularized estimator that estimates the change in structure directly, without having to estimate the structures of the individual Ising models. The estimator works with any norm and generalizes to other graphical models under mild assumptions. We show that only one of the two sample sets, say the one of size $n_2$, needs to satisfy the sample complexity requirement for the estimator to work, and that the estimation error decreases as $\frac{c}{\sqrt{\min(n_1, n_2)}}$, where $c$ depends on the Gaussian width of the unit norm ball. For example, for the $\ell_1$ norm applied to an $s$-sparse change, the change can be accurately estimated with $\min(n_1, n_2) = O(s \log p)$ samples, which is sharper than an existing result requiring $n_1 = O(s^2 \log p)$ and $n_2 = O(n_1^2)$. Experimental results illustrating the effectiveness of the proposed estimator are presented.
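The abstract does not reproduce the paper's objective, so the following is a minimal sketch of the general recipe it describes: minimize a smooth data-dependent loss plus a norm penalty, solved here by proximal gradient descent. As a stand-in for the Ising-specific loss, the sketch assumes a D-trace-style quadratic surrogate of the kind used in the Gaussian differential-network literature; the function name `direct_change_estimate`, the regularization weight `lam`, and the surrogate loss itself are illustrative assumptions, not the paper's estimator.

```python
import numpy as np

def soft_threshold(x, t):
    """Elementwise soft-thresholding: the proximal operator of t * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def direct_change_estimate(X1, X2, lam=0.1, n_iter=500):
    """Directly estimate the change Delta between two dependency structures.

    X1: (n1, p) array of +/-1 observations from the first model.
    X2: (n2, p) array of +/-1 observations from the second model.

    Minimizes a D-trace-style surrogate loss (an assumption for illustration,
    not the paper's Ising loss) plus an l1 penalty:
        (1/4) [tr(D S1 D S2) + tr(D S2 D S1)] - tr(D (S1 - S2)) + lam * ||D||_1
    via proximal gradient descent. At the population level, the unpenalized
    minimizer of this loss is the difference of the inverse second-moment
    matrices of the two models.
    """
    S1 = X1.T @ X1 / X1.shape[0]   # second-moment matrix of sample 1
    S2 = X2.T @ X2 / X2.shape[0]   # second-moment matrix of sample 2
    p = S1.shape[0]
    Delta = np.zeros((p, p))
    # Step size 1/L, where L = ||S1||_2 * ||S2||_2 bounds the Lipschitz
    # constant of the gradient of the quadratic loss term.
    step = 1.0 / (np.linalg.norm(S1, 2) * np.linalg.norm(S2, 2))
    for _ in range(n_iter):
        grad = 0.5 * (S1 @ Delta @ S2 + S2 @ Delta @ S1) - (S1 - S2)
        Delta = soft_threshold(Delta - step * grad, step * lam)
    return Delta

# Usage on synthetic +/-1 data; with both samples drawn from the same trivial
# model, the estimated change should be near zero.
rng = np.random.default_rng(0)
X1 = rng.choice([-1.0, 1.0], size=(500, 20))
X2 = rng.choice([-1.0, 1.0], size=(800, 20))
Delta_hat = direct_change_estimate(X1, X2, lam=0.05)
```

Swapping the $\ell_1$ penalty for another atomic norm, as the abstract's generality suggests, only changes the proximal step: replace `soft_threshold` with that norm's proximal operator (e.g., blockwise shrinkage for group sparsity); the rest of the loop is unchanged.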
