Towards scalable and data efficient learning of Markov boundaries

We propose algorithms for learning Markov boundaries from data without having to learn a Bayesian network first. We study their correctness, scalability and data efficiency. The latter two properties matter because we aim to use the algorithms to identify the minimal set of features needed for probabilistic classification in databases with thousands of features but few instances, such as gene expression databases. We evaluate the algorithms on synthetic and real databases, including one with 139,351 features.
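To make the idea of learning a Markov boundary directly from data concrete, the following is a minimal sketch of a generic grow-shrink style search driven by conditional independence tests. It is not the algorithms proposed here: the function names, the use of empirical conditional mutual information as the test statistic, the fixed `threshold`, and the toy data are all assumptions chosen for illustration, and the sketch makes no attempt at the scalability or data efficiency studied in the paper.

```python
from collections import Counter
from math import log


def cond_mutual_info(rows, x, y, z):
    """Empirical conditional mutual information I(X; Y | Z) in nats.

    rows: list of tuples of discrete values; x, y: column indices;
    z: list of column indices to condition on (may be empty).
    """
    n = len(rows)
    c_xyz, c_xz, c_yz, c_z = Counter(), Counter(), Counter(), Counter()
    for r in rows:
        zv = tuple(r[i] for i in z)
        c_xyz[(r[x], r[y], zv)] += 1
        c_xz[(r[x], zv)] += 1
        c_yz[(r[y], zv)] += 1
        c_z[zv] += 1
    return sum(
        (c / n) * log(c * c_z[zv] / (c_xz[(xv, zv)] * c_yz[(yv, zv)]))
        for (xv, yv, zv), c in c_xyz.items()
    )


def markov_boundary(rows, target, threshold=0.01):
    """Hypothetical grow-shrink search for the Markov boundary of `target`.

    A feature is treated as dependent on the target given the current
    boundary when the estimated conditional mutual information exceeds
    `threshold` (an illustrative stand-in for a statistical test).
    """
    candidates = [i for i in range(len(rows[0])) if i != target]
    mb = []
    # Grow phase: repeatedly add the most strongly associated feature.
    while True:
        rest = [v for v in candidates if v not in mb]
        if not rest:
            break
        best = max(rest, key=lambda v: cond_mutual_info(rows, target, v, mb))
        if cond_mutual_info(rows, target, best, mb) <= threshold:
            break
        mb.append(best)
    # Shrink phase: drop features that are independent of the target
    # given the remaining members of the boundary.
    for v in list(mb):
        others = [u for u in mb if u != v]
        if cond_mutual_info(rows, target, v, others) <= threshold:
            mb.remove(v)
    return sorted(mb)


# Toy chain X1 -> X0 -> T, stored as columns (x0, x1, t).
# X0 agrees with X1 in 3 of 4 cases, and T agrees with X0 in 3 of 4 cases.
rows = []
for x1 in (0, 1):
    for x0 in (x1, x1, x1, 1 - x1):
        for t in (x0, x0, x0, 1 - x0):
            rows.append((x0, x1, t))

print(markov_boundary(rows, target=2))
```

In this toy chain, X1 is associated with the target but carries no further information once X0 is known, so the search keeps only column 0. No Bayesian network is built at any point; the search only queries conditional independence around the target.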
