Improving the Performance of Boosting for Naive Bayesian Classification

This paper investigates boosting naive Bayesian classification. It first shows that, on average over a set of natural domains, boosting does not improve the accuracy of the naive Bayesian classifier. Based on an analysis of the reasons for boosting's failure, we propose introducing tree structures into naive Bayesian classification to improve the performance of boosting when it is applied to naive Bayesian classification. The experimental results show that although introducing tree structures increases the average error of individual naive Bayesian models, boosting naive Bayesian classifiers with tree structures achieves significantly lower average error than the naive Bayesian classifier, and thus provides a way of successfully applying boosting to naive Bayesian classification.
