Efficient Learning of Quadratic Variance Function Directed Acyclic Graphs via Topological Layers

Directed acyclic graph (DAG) models are widely used to represent causal relationships among random variables in many application domains. This paper studies a special class of non-Gaussian DAG models, where the conditional variance of each node given its parents is a quadratic function of its conditional mean. This class of non-Gaussian DAG models is fairly flexible and admits many popular distributions as special cases, including Poisson, Binomial, Geometric, Exponential, and Gamma. To facilitate learning, we introduce a novel concept of topological layers and develop an efficient DAG learning algorithm. The algorithm first reconstructs the topological layers in a hierarchical fashion and then recovers the directed edges between nodes in different layers, which requires much less computational cost than most existing algorithms in the literature. Its advantage is demonstrated in a number of simulated examples, as well as in applications to two real-life datasets: an NBA player statistics dataset and a cosmetic sales dataset collected by Alibaba.
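As a small illustration of the quadratic variance property for the listed distributions, the sketch below checks that the textbook variance of each family equals beta0 * mean + beta1 * mean^2. The coefficient names beta0 and beta1, the helper names, the parameter values, and the chosen parametrizations (Geometric counting failures before the first success, Gamma with shape and rate) are assumptions made for this example and are not taken from the paper.

```python
import math


def qvf_variance(mean, beta0, beta1):
    """Variance implied by a quadratic variance function of the mean."""
    return beta0 * mean + beta1 * mean ** 2


def qvf_examples():
    """Return {name: (mean, variance, beta0, beta1)} from closed-form moments.

    All parameter values are arbitrary illustrative choices.
    """
    lam = 3.0              # Poisson rate
    n, p = 10, 0.3         # Binomial trials and success probability
    q = 0.25               # Geometric success probability (counts failures)
    rate = 2.0             # Exponential / Gamma rate
    shape = 4.0            # Gamma shape
    return {
        "Poisson": (lam, lam, 1.0, 0.0),
        "Binomial": (n * p, n * p * (1 - p), 1.0, -1.0 / n),
        "Geometric": ((1 - q) / q, (1 - q) / q ** 2, 1.0, 1.0),
        "Exponential": (1 / rate, 1 / rate ** 2, 0.0, 1.0),
        "Gamma": (shape / rate, shape / rate ** 2, 0.0, 1.0 / shape),
    }


def check_all():
    """True if every family's variance matches its quadratic function of the mean."""
    return all(
        math.isclose(qvf_variance(mean, b0, b1), var)
        for mean, var, b0, b1 in qvf_examples().values()
    )


if __name__ == "__main__":
    for name, (mean, var, b0, b1) in qvf_examples().items():
        print(name, mean, var, qvf_variance(mean, b0, b1))
```

In a DAG model of this kind, the same relation would be applied to the conditional mean and conditional variance of a node given its parents rather than to the marginal moments used here.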
