A Flexible Framework for Nonparametric Graphical Modeling that Accommodates Machine Learning

Graphical modeling has been broadly useful for exploring the dependence structure among features in a dataset. However, the strength of graphical modeling hinges on our ability to encode and estimate conditional dependencies. In particular, commonly used measures such as partial correlation are only meaningful under strongly parametric (in this case, multivariate Gaussian) assumptions. These assumptions are unverifiable, and there is often little reason to believe they hold in practice. In this paper, we instead consider three nonparametric measures of conditional dependence. These measures are meaningful without structural assumptions on the multivariate distribution of the data. We also show that, for two of these measures, there are simple, strong plug-in estimators that require only the estimation of a conditional mean. These plug-in estimators (1) are asymptotically linear and nonparametrically efficient, (2) allow incorporation of flexible machine learning techniques for conditional mean estimation, and (3) enable the construction of valid Wald-type confidence intervals. Furthermore, by leveraging the influence functions of these estimators, one can obtain intervals with simultaneous coverage guarantees for all pairs of features.
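
To make the plug-in idea concrete, the following is a minimal sketch, not the paper's implementation. The abstract does not name its three measures, so the sketch assumes one of them is the expected conditional covariance E[Cov(X, Y | Z)], estimated by averaging products of residuals from two conditional mean fits. The Nadaraya-Watson smoother, the two-fold cross-fitting, the fixed bandwidth, and the names `kernel_mean`, `expected_cond_cov`, and `simulate` are all illustrative assumptions. Any regression method could stand in for the smoother.

```python
import math
import random
import statistics


def kernel_mean(z_train, v_train, z0, bandwidth):
    """Nadaraya-Watson estimate of E[V | Z = z0] using a Gaussian kernel."""
    weights = [math.exp(-0.5 * ((zt - z0) / bandwidth) ** 2) for zt in z_train]
    return sum(w * v for w, v in zip(weights, v_train)) / sum(weights)


def expected_cond_cov(x, y, z, n_folds=2, bandwidth=0.3, crit=1.96):
    """Cross-fitted plug-in estimate of E[Cov(X, Y | Z)] with a Wald interval.

    Assumed target (not stated in the abstract): the mean of
    (X - E[X|Z]) * (Y - E[Y|Z]). Only conditional means are estimated.
    """
    n = len(z)
    products = [0.0] * n
    for k in range(n_folds):
        held_out = range(k, n, n_folds)
        held = set(held_out)
        train = [i for i in range(n) if i not in held]
        z_tr = [z[i] for i in train]
        x_tr = [x[i] for i in train]
        y_tr = [y[i] for i in train]
        # Residuals on the held-out fold use fits from the other fold(s).
        for i in held_out:
            res_x = x[i] - kernel_mean(z_tr, x_tr, z[i], bandwidth)
            res_y = y[i] - kernel_mean(z_tr, y_tr, z[i], bandwidth)
            products[i] = res_x * res_y
    estimate = statistics.fmean(products)
    # Wald-type interval: the spread of the residual products stands in
    # for the variance of the (assumed) influence function.
    std_err = statistics.stdev(products) / math.sqrt(n)
    return estimate, (estimate - crit * std_err, estimate + crit * std_err)


def simulate(n, link, seed):
    """Toy data: X and Y both depend on Z; `link` sets their conditional covariance."""
    rng = random.Random(seed)
    x, y, z = [], [], []
    for _ in range(n):
        zi = rng.uniform(-1.0, 1.0)
        shared = rng.gauss(0.0, 1.0)  # noise shared by X and Y when link != 0
        x.append(math.sin(zi) + shared)
        y.append(zi ** 2 + link * shared + rng.gauss(0.0, 1.0))
        z.append(zi)
    return x, y, z


for link in (0.0, 0.5):
    x, y, z = simulate(400, link, seed=1)
    est, (lo, hi) = expected_cond_cov(x, y, z)
    print(f"link={link}: estimate={est:.3f}, 95% interval=({lo:.3f}, {hi:.3f})")
```

In this toy setup the true expected conditional covariance equals `link`, so the two runs contrast a conditionally uncorrelated pair with a conditionally dependent one. Intervals with simultaneous coverage over all feature pairs would need an additional step that this sketch does not attempt.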
