On the classification performance of TAN and general Bayesian networks

Over a decade ago, Friedman et al. introduced the Tree Augmented Naive Bayes (TAN) classifier, with experiments indicating that it significantly outperformed Naive Bayes (NB) in terms of classification accuracy, whereas general Bayesian network (GBN) classifiers performed no better than NB. This paper challenges those claims, using a careful experimental analysis to show that GBN classifiers significantly outperform NB on the datasets analyzed and perform comparably to TAN. The poor performance reported by Friedman et al. is found to be attributable not to the GBN per se, but to their use of simple empirical frequencies to estimate GBN parameters; basic parameter smoothing (used in their TAN analyses but not in their GBN analyses) improves GBN performance significantly. It is concluded that, while GBN classifiers may have some limitations, they deserve greater attention, particularly in domains where insight into classification decisions, as well as good accuracy, is required.
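The difference between empirical-frequency estimates and smoothed estimates can be illustrated with a minimal sketch. It uses simple additive (pseudo-count) smoothing as an assumption for illustration; it is not necessarily the smoothing scheme used in the paper or by Friedman et al., and the function name `estimate_cpt`, the parameter `alpha`, and the toy data are hypothetical.

```python
from collections import Counter

def estimate_cpt(pairs, child_values, alpha=0.0):
    """Estimate P(child | parent) from a list of (parent, child) pairs.

    alpha = 0 yields raw empirical frequencies; alpha > 0 adds a
    pseudo-count to every child value (additive smoothing, an
    illustrative assumption rather than the paper's exact method).
    """
    joint = Counter(pairs)
    parent_totals = Counter(parent for parent, _ in pairs)
    cpt = {}
    for parent, total in parent_totals.items():
        denom = total + alpha * len(child_values)
        cpt[parent] = {
            child: (joint[(parent, child)] + alpha) / denom
            for child in child_values
        }
    return cpt

# Toy data: the combination ("a", "no") never occurs.
data = [("a", "yes"), ("a", "yes"), ("a", "yes"), ("b", "yes"), ("b", "no")]
values = ["yes", "no"]

raw = estimate_cpt(data, values, alpha=0.0)       # empirical frequencies
smoothed = estimate_cpt(data, values, alpha=1.0)  # pseudo-count of 1 per value

print(raw["a"]["no"], smoothed["a"]["no"])
```

With raw frequencies, an unseen combination receives probability zero, which can zero out an entire product of conditional probabilities at classification time. Smoothing keeps every estimate strictly positive, and the effect matters most when a node has many parent configurations and few training instances per configuration.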

[1] Charles X. Ling, et al. The Representational Power of Discrete Bayesian Networks, 2002, J. Mach. Learn. Res.

[2] Gregory F. Cooper, et al. A Bayesian method for the induction of probabilistic networks from data, 1992, Machine Learning.

[3] Remco R. Bouckaert, et al. Estimating replicability of classifier learning experiments, 2004, ICML.

[4] Pedro M. Domingos, et al. On the Optimality of the Simple Bayesian Classifier under Zero-One Loss, 1997, Machine Learning.

[5] Michael G. Madden, et al. The Performance of Bayesian Network Classifiers Constructed using Different Techniques, 2003.

[6] Ron Kohavi, et al. Data Mining Using MLC++: A Machine Learning Library in C++, 1996, Int. J. Artif. Intell. Tools.

[7] Yoshua Bengio, et al. Inference for the Generalization Error, 1999, Machine Learning.

[8] Pedro M. Domingos, et al. Learning Bayesian network classifiers by maximizing conditional likelihood, 2004, ICML.

[9] David Maxwell Chickering, et al. Learning Bayesian Networks: The Combination of Knowledge and Statistical Data, 1994, Machine Learning.

[10] Pedro M. Domingos, et al. Beyond Independence: Conditions for the Optimality of the Simple Bayesian Classifier, 1996, ICML.

[11] David Maxwell Chickering, et al. On the incompatibility of faithfulness and monotone DAG faithfulness, 2006, Artif. Intell.

[12] Judea Pearl, et al. Probabilistic reasoning in intelligent systems - networks of plausible inference, 1991, Morgan Kaufmann series in representation and reasoning.

[13] Russell Greiner, et al. Learning Bayesian Belief Network Classifiers: Algorithms and System, 2001, Canadian Conference on AI.

[14] Nir Friedman, et al. Bayesian Network Classifiers, 1997, Machine Learning.

[15] Ron Kohavi, et al. Data Mining using MLC++, 1996.

[16] Richard E. Neapolitan, et al. Learning Bayesian networks, 2007, KDD '07.

[17] Eamonn J. Keogh, et al. Learning the Structure of Augmented Bayesian Classifiers, 2002, Int. J. Artif. Intell. Tools.

[18] Tomi Silander, et al. A Simple Approach for Finding the Globally Optimal Bayesian Network Structure, 2006, UAI.

[19] Wray L. Buntine. Theory Refinement on Bayesian Networks, 1991, UAI.

[20] Weiru Liu, et al. Learning belief networks from data: an information theory based approach, 1997, CIKM '97.

[21] David A. Bell, et al. Learning Bayesian networks from data: An information-theory based approach, 2002, Artif. Intell.

[22] Ron Kohavi, et al. Supervised and Unsupervised Discretization of Continuous Features, 1995, ICML.

[23] Dan Roth, et al. Understanding Probabilistic Classifiers, 2001, ECML.

[24] Gregory F. Cooper, et al. The Computational Complexity of Probabilistic Inference Using Bayesian Belief Networks, 1990, Artif. Intell.

[25] Jan Vanthienen, et al. Learning Bayesian network classifiers for credit scoring using Markov chain Monte Carlo search, 2002, Object recognition supported by user interaction for service robots.

[26] David Maxwell Chickering, et al. Optimal Structure Identification With Greedy Search, 2002, J. Mach. Learn. Res.

[27] Charles X. Ling, et al. An Improved Learning Algorithm for Augmented Naive Bayes, 2001, PAKDD.

[28] Ramón López de Mántaras, et al. TAN Classifiers Based on Decomposable Distributions, 2005, Machine Learning.