Bayesian Algorithms for Causal Data Mining

We present two Bayesian algorithms, CD-B and CD-H, for discovering unconfounded cause-and-effect relationships from observational data without assuming causal sufficiency, an assumption that precludes hidden common causes of the observed variables. The CD-B algorithm first estimates the Markov blanket of a node X using a Bayesian greedy search method and then applies Bayesian scoring methods to distinguish the parents and children of X within that blanket. Using the set of parents and the set of children, CD-B constructs a global Bayesian network and outputs the causal effects of a node X based on the identification of Y arcs. Recall that if a node X has two parent nodes A and B and a child node C such that there is no arc between A and B, and neither A nor B is a parent of C, then the arc from X to C is called a Y arc. The CD-H algorithm instead uses the MMPC algorithm to estimate the union of the parents and children of a target node X; its subsequent steps are similar to those of CD-B. We evaluated CD-B and CD-H empirically on data simulated from four different Bayesian networks. We also present comparative results based on the identification of Y structures and Y arcs from the output of the PC, MMHC, and FCI algorithms. The results appear promising for mining, from observational data, causal relationships that are unconfounded by hidden variables.
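The Y-arc condition recalled above can be checked mechanically on a directed graph. The following is a minimal sketch of that check only, not of CD-B or CD-H. It assumes the graph is given as a collection of (parent, child) pairs with string node labels; the function name `find_y_arcs` and the example graphs are hypothetical and chosen for illustration.

```python
from itertools import combinations


def find_y_arcs(edges):
    """Return the set of arcs (X, C) that meet the Y-arc condition as stated
    in the abstract: X has two parents A and B with no arc between them,
    and neither A nor B is a parent of X's child C.

    `edges` is an iterable of (parent, child) pairs (an assumed encoding).
    """
    edges = set(edges)
    parents, children = {}, {}
    for u, v in edges:
        children.setdefault(u, set()).add(v)
        parents.setdefault(v, set()).add(u)

    y_arcs = set()
    for x, pa in parents.items():
        # Consider every unordered pair of parents of X.
        for a, b in combinations(sorted(pa), 2):
            # A and B must not be joined by an arc in either direction.
            if (a, b) in edges or (b, a) in edges:
                continue
            for c in children.get(x, ()):
                # Guard against malformed (cyclic) input where C is A or B.
                if c in (a, b):
                    continue
                # Neither A nor B may be a parent of C.
                if (a, c) not in edges and (b, c) not in edges:
                    y_arcs.add((x, c))
    return y_arcs


# Hypothetical example graphs.
y_graph = [("A", "X"), ("B", "X"), ("X", "C")]
shielded = y_graph + [("A", "B")]      # A and B are now adjacent
confounded = y_graph + [("A", "C")]    # A is now also a parent of C

y_result = find_y_arcs(y_graph)
shielded_result = find_y_arcs(shielded)
confounded_result = find_y_arcs(confounded)
```

On these examples only the first graph yields a Y arc, namely the arc from X to C; adding an arc between A and B, or making A a parent of C, violates the condition. The check takes a fully oriented graph as input, so it says nothing about how the orientations were learned from data.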
