Reliable Graph Discovery

A critical question in data mining is whether we can always trust what a data mining system discovers, unconditionally. The answer is clearly no. If not, when can we trust a discovery? What factors affect its reliability, and how? These are the questions investigated in this chapter. We first define reliability and its measurements, and analyse the factors that affect it. We then examine how model complexity, weak links, varying sample sizes and the ability of different learners affect the reliability of graphical model discovery. The experimental results reveal that (1) the larger the sample size used for discovery, the higher the reliability achieved; (2) the stronger a graph link is, the easier it is to discover and thus the higher the reliability that can be achieved; and (3) the complexity of a graph also plays an important role: the higher the complexity, the more difficult the graph is to induce and the lower the resulting reliability. We also examined the performance differences among discovery algorithms, which reveal the impact of the discovery process itself. The results show that the MML method is more reliable and robust than standard significance tests in recovering graph links from small samples and weak links.
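As a rough illustration of the first two findings, the following minimal sketch (not the chapter's actual experiment; the linear-Gaussian model, link strengths, sample sizes and significance level are all illustrative assumptions) estimates the reliability of recovering a single link as the fraction of repeated trials in which a significance test detects it:

```python
# Minimal sketch: estimate the "reliability" of recovering one link a -> b
# as the fraction of trials in which a significance test detects the
# dependency. The model and parameters below are illustrative assumptions,
# not the chapter's experimental setup.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def recovery_rate(n, strength, trials=500, alpha=0.05):
    """Fraction of trials in which the link a -> b is detected at level alpha."""
    hits = 0
    for _ in range(trials):
        a = rng.normal(size=n)
        b = strength * a + rng.normal(size=n)  # b depends (perhaps weakly) on a
        r, p = stats.pearsonr(a, b)            # test the a-b association
        hits += p < alpha
    return hits / trials

for strength in (0.1, 0.3):                    # weak vs. moderately strong link
    for n in (30, 100, 1000):                  # varying sample sizes
        print(f"strength={strength}, n={n}: "
              f"reliability ~ {recovery_rate(n, strength):.2f}")
```

Under these assumptions the recovery rate rises with the sample size and with the link strength, matching findings (1) and (2): a weak link (strength 0.1) is recovered unreliably at n=30 but near-perfectly at n=1000.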
