Statistical Inference for Pairwise Graphical Models Using Score Matching

Probabilistic graphical models have been widely used to model complex systems and aid scientific discoveries. As a result, there is a large body of literature focused on consistent model selection. However, scientists are often interested in understanding the uncertainty associated with the estimated parameters, which the current literature has not addressed thoroughly. In this paper, we propose a novel estimator of edge parameters in pairwise graphical models based on the Hyvärinen scoring rule. The Hyvärinen scoring rule is especially useful when the normalizing constant cannot be obtained efficiently in closed form. We prove that the estimator is $\sqrt{n}$-consistent and asymptotically normal. This result allows us to construct confidence intervals and hypothesis tests for edge parameters. We establish our results under conditions that are typically assumed in the literature for consistent estimation. However, we do not require that the estimator consistently recover the graph structure. In particular, we prove that the asymptotic distribution of the estimator is robust to model selection mistakes and uniformly valid over a large class of data-generating processes. We illustrate the validity of our estimator through extensive simulation studies.
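
As a brief illustration of why the Hyvärinen scoring rule avoids the normalizing constant, the following sketch writes out the standard form of the score for a density on $\mathbb{R}^m$ known only up to normalization. The symbols $q_\theta$, $Z(\theta)$, $S$, and $m$ are notation assumed here for illustration and are not taken from the abstract.

```latex
% Assumed notation: p_\theta(x) = q_\theta(x) / Z(\theta), x \in \mathbb{R}^m,
% where Z(\theta) = \int q_\theta(x)\,dx may be intractable.
\begin{align*}
S(x, p_\theta)
  &= \tfrac{1}{2}\,\bigl\lVert \nabla_x \log p_\theta(x) \bigr\rVert_2^2
     + \Delta_x \log p_\theta(x), \\
\log p_\theta(x)
  &= \log q_\theta(x) - \log Z(\theta)
  \quad\Longrightarrow\quad
  \nabla_x \log p_\theta(x) = \nabla_x \log q_\theta(x), \\
S(x, p_\theta)
  &= \tfrac{1}{2} \sum_{j=1}^{m}
     \Bigl( \frac{\partial \log q_\theta(x)}{\partial x_j} \Bigr)^{2}
     + \sum_{j=1}^{m} \frac{\partial^{2} \log q_\theta(x)}{\partial x_j^{2}} .
\end{align*}
```

Because $\log Z(\theta)$ does not depend on $x$, it vanishes under differentiation with respect to $x$, so minimizing the empirical average of $S(x_i, p_\theta)$ over the sample requires only the unnormalized density $q_\theta$.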
