Learning the Structure of Augmented Bayesian Classifiers

The naive Bayes classifier is built on the assumption of conditional independence between the attributes given the class. The algorithm has been shown to be surprisingly robust to obvious violations of this condition, but it is natural to ask whether accuracy can be improved further by relaxing this assumption. We examine an approach where naive Bayes is augmented by the addition of correlation arcs between attributes. We explore two methods for finding the set of augmenting arcs: a greedy hill-climbing search and a novel, more computationally efficient algorithm that we call SuperParent. We compare these methods to TAN, a state-of-the-art distribution-based approach to finding the augmenting arcs.
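
To make the relaxed assumption concrete, the two factorizations can be sketched as follows. This is an illustrative sketch, not notation taken from the paper: the symbols C (class), A_1, ..., A_n (attributes), and \pi(i) (the index of the attribute at the tail of an augmenting arc into A_i) are assumptions introduced here, as is the restriction that each attribute receives at most one augmenting arc.

```latex
% Naive Bayes: every attribute A_i depends only on the class C.
P(C \mid A_1, \dots, A_n) \;\propto\; P(C) \prod_{i=1}^{n} P(A_i \mid C)

% Augmented naive Bayes (assumed form): an attribute A_i may additionally
% depend on one other attribute A_{\pi(i)} through an augmenting arc.
P(C \mid A_1, \dots, A_n) \;\propto\; P(C) \prod_{i=1}^{n} P\bigl(A_i \mid C, A_{\pi(i)}\bigr)
```

For an attribute with no incoming augmenting arc, the corresponding factor reduces to P(A_i \mid C), so a model with no augmenting arcs is exactly naive Bayes.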
