Parallel ensemble methods for causal direction inference

Inferring the causal direction between two variables from observational data is one of the most fundamental and challenging problems in data science. A causal direction inference algorithm maps the observed data to a binary value that represents either "x causes y" or "y causes x". By their nature, the outputs of such algorithms are unstable under changes in the data points, so the accuracy of causal direction inference can be improved significantly by using parallel ensemble frameworks. This paper proposes new causal direction inference algorithms based on several parallel ensemble schemes and gives theoretical analyses of their accuracy rates. Experiments are carried out on both artificial and real-world data sets, and they demonstrate the accuracy of the methods and their computational efficiency in a parallel computing environment.
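To make the idea concrete, the following is a minimal sketch of one possible parallel ensemble, assuming bootstrap resampling with a majority vote; the abstract does not say which ensemble schemes the paper actually uses. The names ensemble_direction and placeholder_decider, the encoding +1 for "x causes y" and -1 for "y causes x", and all parameters are hypothetical. The placeholder base decider is not a valid causal criterion; it only stands in for a real base algorithm so that the example runs.

```python
import random
from concurrent.futures import ThreadPoolExecutor
from statistics import pvariance


def placeholder_decider(xs, ys):
    """Hypothetical stand-in for a real base algorithm (NOT a valid causal test).

    Returns +1 for "x causes y" and -1 for "y causes x" by comparing the
    variances of the two variables after min-max scaling.
    """
    def scaled_variance(values):
        lo, hi = min(values), max(values)
        if hi == lo:
            return 0.0
        return pvariance([(v - lo) / (hi - lo) for v in values])

    return 1 if scaled_variance(ys) <= scaled_variance(xs) else -1


def ensemble_direction(xs, ys, base_decider, n_members=21, seed=0, max_workers=4):
    """Majority vote over base decisions made on bootstrap resamples.

    Use an odd n_members so that the +1/-1 votes cannot tie.
    """
    n = len(xs)

    def run_member(member_seed):
        # Each member has its own generator, so results do not depend on
        # the order in which the worker threads happen to run.
        rng = random.Random(member_seed)
        idx = [rng.randrange(n) for _ in range(n)]
        return base_decider([xs[i] for i in idx], [ys[i] for i in idx])

    member_seeds = [seed + k for k in range(n_members)]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        votes = list(pool.map(run_member, member_seeds))

    return 1 if sum(votes) > 0 else -1


if __name__ == "__main__":
    xs = [i / 99 for i in range(100)]
    ys = [x ** 8 for x in xs]
    print(ensemble_direction(xs, ys, placeholder_decider))
```

Because the ensemble members are independent, they can be evaluated concurrently. In CPython, threads do not speed up CPU-bound pure-Python work, so a real base algorithm would more likely be run through concurrent.futures.ProcessPoolExecutor or a similar process-based pool.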
