Causal Discovery from Streaming Features

In this paper, we study a new research problem: causal discovery from streaming features. The defining characteristic of streaming features is that not all features are available before learning begins, so feature generation and selection often have to be interleaved. Streaming features have been studied extensively in classification, but little attention has been paid to causal discovery in this setting. To address this problem, we propose a novel algorithm, CDFSF (Causal Discovery From Streaming Features), which consists of two phases: growing and shrinking. In the growing phase, CDFSF finds candidate parents or children for each feature seen so far; in the shrinking phase, it dynamically removes false positives from the current sets of candidate parents and children. To improve the efficiency of CDFSF, we present S-CDFSF, a faster version that exploits two symmetry theorems. Experimental results validate our algorithms in comparison with other state-of-the-art causal discovery algorithms.
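
To make the two-phase idea concrete, the following is a minimal sketch of a generic grow-and-shrink loop over features that arrive one at a time. It is not the CDFSF or S-CDFSF algorithm itself, whose details are not given in this abstract. The function names (`fisher_z_independent`, `stream_pc_sets`), the choice of a Fisher z-test on partial correlations (which assumes roughly Gaussian data), the significance level `alpha`, and the cap `max_cond` on conditioning-set size are all assumptions made for illustration.

```python
import itertools
import math

import numpy as np


def fisher_z_independent(data, i, j, cond=(), alpha=0.01):
    """Return True if columns i and j look independent given columns `cond`.

    Assumption: a Fisher z-test on the partial correlation, which is only
    appropriate for approximately Gaussian data.
    """
    idx = [i, j] + list(cond)
    corr = np.corrcoef(data[:, idx], rowvar=False)
    prec = np.linalg.pinv(corr)
    # Partial correlation of i and j given cond, read off the precision matrix.
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])
    r = max(min(r, 0.999999), -0.999999)
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(data.shape[0] - len(cond) - 3) * abs(z)
    p_value = math.erfc(stat / math.sqrt(2))  # two-sided normal p-value
    return p_value > alpha


def stream_pc_sets(data, order, alpha=0.01, max_cond=2):
    """Maintain candidate parent/child sets as features arrive in `order`."""
    pc = {}
    for new in order:
        pc[new] = set()
        # Growing: link the new feature to every seen feature it depends on.
        for old in list(pc):
            if old != new and not fisher_z_independent(data, new, old, (), alpha):
                pc[new].add(old)
                pc[old].add(new)
        # Shrinking: drop links explained away by other current candidates.
        for x in list(pc):
            for y in sorted(pc[x]):
                others = sorted(pc[x] - {y})
                separated = any(
                    fisher_z_independent(data, x, y, subset, alpha)
                    for k in range(1, max_cond + 1)
                    for subset in itertools.combinations(others, k)
                )
                if separated:
                    pc[x].discard(y)
                    pc[y].discard(x)
    return pc
```

As a usage illustration, consider synthetic data from a chain A -> B -> C in which the features arrive in the order A, C, B. The growing phase first links A and C because they are marginally dependent; once B arrives, the shrinking phase can remove that link, since A and C are independent given B. Calling `stream_pc_sets(data, order=[0, 2, 1])` on such data would be expected to leave only the A-B and B-C links, subject to the usual error rates of the statistical test.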
