Causal Learning From Predictive Modeling for Observational Data

We consider the problem of learning structured causal models from observational data. In this work, we use causal Bayesian networks to represent causal relationships among model variables. To this end, we explore the use of two types of independencies: context-specific independence (CSI) and mutual independence (MI). We use CSI to identify the candidate set of causal relationships, and then use MI to quantify their strengths and construct a causal model. We validate the learned models on benchmark networks and demonstrate their effectiveness compared with several state-of-the-art algorithms for learning causal Bayesian networks from observational data.
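
The two-stage idea above (propose candidate relationships, then score their strength) can be illustrated with a minimal sketch. This is not the authors' method: it assumes the candidate pairs are already given, and it uses empirical mutual information between discrete variables as a hypothetical stand-in for the strength measure. The function names `mutual_information` and `rank_candidate_edges` and the toy data are illustrative assumptions only.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    if len(xs) != len(ys) or len(xs) == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def rank_candidate_edges(data, candidates):
    """Score each hypothetical candidate pair and sort by decreasing strength.

    data: dict mapping variable name -> list of discrete observations
    candidates: list of (source, target) variable-name pairs (assumed given)
    """
    scored = [(a, b, mutual_information(data[a], data[b])) for a, b in candidates]
    return sorted(scored, key=lambda t: t[2], reverse=True)


# Toy data (made up for illustration): B copies A, C is unrelated to A.
toy = {
    "A": [0, 0, 1, 1],
    "B": [0, 0, 1, 1],
    "C": [0, 1, 0, 1],
}
ranking = rank_candidate_edges(toy, [("A", "C"), ("A", "B")])
```

In this toy setting the pair ("A", "B") ranks first because the two variables are identical, while ("A", "C") scores zero. Note that such a score is symmetric and does not by itself determine the direction of an edge.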
