Discovering Reliable Causal Rules

We study the problem of deriving policies, or rules, that when enacted on a complex system, cause a desired outcome. Absent the ability to perform controlled experiments, such rules have to be inferred from past observations of the system's behaviour. This is a challenging problem for two reasons. First, observational effects are often unrepresentative of the underlying causal effect because they are skewed by the presence of confounding factors. Second, naive empirical estimates of a rule's effect have high variance, and hence their maximisation can lead to random results. To address these issues, we first measure the causal effect of a rule from observational data, adjusting for the effect of potential confounders. Importantly, we provide a graphical criterion under which causal rule discovery is possible. Moreover, to discover reliable causal rules from a sample, we propose a conservative and consistent estimator of the causal effect, and derive an efficient and exact algorithm that maximises the estimator. On synthetic data, the proposed estimator converges faster to the ground truth than the naive estimator and recovers relevant causal rules even at small sample sizes. Extensive experiments on a variety of real-world datasets show that the proposed algorithm is efficient and discovers meaningful rules.
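To illustrate why adjusting for confounders matters, the following minimal sketch contrasts a naive effect estimate with a stratified (backdoor-style) adjustment on toy data. It is not the estimator proposed in this paper: the conservative estimator and the search algorithm are not reproduced here. The function names, the single discrete confounder `z`, and the toy counts are all assumptions made for illustration.

```python
from collections import defaultdict

def naive_effect(rows):
    """p(y=1 | rule holds) - p(y=1 | rule fails), ignoring confounders."""
    n = {True: 0, False: 0}
    pos = {True: 0, False: 0}
    for rule, _z, y in rows:
        n[rule] += 1
        pos[rule] += y
    return pos[True] / n[True] - pos[False] / n[False]

def adjusted_effect(rows):
    """Stratify on the confounder z and average the per-stratum
    differences, weighted by the empirical frequency of each stratum."""
    # counts[z][rule] = [number of rows, number of rows with y == 1]
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for rule, z, y in rows:
        cell = counts[z][rule]
        cell[0] += 1
        cell[1] += y
    total = len(rows)
    effect = 0.0
    for cells in counts.values():
        n1, pos1 = cells[True]
        n0, pos0 = cells[False]
        if n1 == 0 or n0 == 0:
            # Simplification (assumption): strata without overlap are skipped.
            continue
        effect += (n1 + n0) / total * (pos1 / n1 - pos0 / n0)
    return effect

def make(rule, z, n, pos):
    """Hypothetical helper: n rows (rule, z, y), of which pos have y == 1."""
    return [(rule, z, 1)] * pos + [(rule, z, 0)] * (n - pos)

# Toy data (assumed): the rule has no effect within either stratum of z,
# but z makes both the rule and the outcome more likely.
rows = (make(True, 0, 10, 2) + make(False, 0, 40, 8)
        + make(True, 1, 40, 32) + make(False, 1, 10, 8))

print(round(naive_effect(rows), 2))     # naive difference in outcome rates
print(round(adjusted_effect(rows), 2))  # difference after stratifying on z
```

On this toy data the naive difference is clearly positive, while the stratified estimate is zero. This is the kind of skew from confounding that the abstract describes.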
