Data Analysis with Bayesian Networks: A Bootstrap Approach

In recent years there has been significant progress in algorithms and methods for inducing Bayesian networks from data. However, in complex data analysis problems, inducing a high-scoring network is not enough. We need to provide confidence measures on features of these networks: Is the existence of an edge between two nodes warranted? Is the Markov blanket of a given node robust? Can we say something about the ordering of the variables? We should be able to address these questions even when there is too little data to induce a high-scoring network. In this paper we propose Efron's bootstrap as a computationally efficient approach for answering these questions. In addition, we propose to use these confidence measures to induce better structures from the data, and to detect the presence of latent variables.
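
The bootstrap idea in the abstract can be illustrated with a minimal sketch. It assumes the nonparametric variant: resample the data with replacement, induce a structure from each resample, and report the fraction of resamples in which a feature (here, an edge) appears. The abstract does not specify a learner, so `learn_edges` is a hypothetical stand-in that thresholds pairwise mutual information and returns undirected edges; it is not a Bayesian network induction algorithm. The toy data, the threshold, and all function names are likewise assumptions made for illustration.

```python
import math
import random
from collections import Counter
from itertools import combinations


def mutual_information(data, i, j):
    """Empirical mutual information (in nats) between columns i and j."""
    n = len(data)
    joint = Counter((row[i], row[j]) for row in data)
    count_i = Counter(row[i] for row in data)
    count_j = Counter(row[j] for row in data)
    return sum(
        (c / n) * math.log(c * n / (count_i[a] * count_j[b]))
        for (a, b), c in joint.items()
    )


def learn_edges(data, threshold=0.05):
    """Hypothetical stand-in learner (NOT a real structure-learning
    algorithm): keep each undirected pair whose mutual information
    exceeds an arbitrary threshold."""
    n_vars = len(data[0])
    return {
        (i, j)
        for i, j in combinations(range(n_vars), 2)
        if mutual_information(data, i, j) > threshold
    }


def bootstrap_edge_confidence(data, learner, n_boot=200, seed=0):
    """Nonparametric bootstrap: the confidence of an edge is the fraction
    of resampled datasets in which the learner induces that edge."""
    rng = random.Random(seed)
    n = len(data)
    counts = Counter()
    for _ in range(n_boot):
        # Draw n rows with replacement from the original dataset.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        counts.update(learner(resample))
    return {edge: c / n_boot for edge, c in counts.items()}


def make_toy_data(n=300, seed=1):
    """Three binary variables: column 1 copies column 0 with 10% noise,
    column 2 is independent of both (assumed toy data)."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a = rng.randint(0, 1)
        b = a if rng.random() < 0.9 else 1 - a
        c = rng.randint(0, 1)
        rows.append((a, b, c))
    return rows


data = make_toy_data()
confidence = bootstrap_edge_confidence(data, learn_edges)
for edge, value in sorted(confidence.items()):
    print(edge, round(value, 3))
```

Any function that maps a dataset to a set of features could be passed in place of `learn_edges`, so the same loop would apply to Markov blanket membership or ordering relations once a real structure learner is supplied.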
