Increasing the Classification Accuracy of Simple Bayesian Classifier

The simple Bayes (naive Bayes) algorithm rests on the assumption that every feature is independent of the other features given the class feature. Because this independence assumption is almost always violated in practice, the crude independence model has largely been rejected in favor of more complicated alternatives, at least by researchers familiar with the theoretical issues. In this study, we attempted to increase the prediction accuracy of the simple Bayes model. Since combining classifiers has been proposed as a direction for improving the performance of individual classifiers, we made use of AdaBoost, with the difference that in each boosting iteration we applied a discretization method and removed redundant features using a filter feature selection method. Finally, we performed a large-scale comparison with other attempts to improve the accuracy of the simple Bayes algorithm, as well as with other state-of-the-art algorithms and ensembles, on 26 standard benchmark datasets, and we obtained better accuracy in most cases while also requiring less training time.
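A minimal sketch of how such a scheme could be wired together is shown below, assuming scikit-learn as the toolkit: a SAMME-style boosting loop in which each round resamples the training set according to the current instance weights, discretizes the features, keeps only the top-k columns by a chi-squared filter, and fits a naive Bayes learner on the result. The dataset, the number of bins, the number of retained features, and the number of boosting rounds are illustrative assumptions, not the paper's actual configuration.

```python
# Rough sketch (not the authors' exact method): SAMME-style boosting of naive
# Bayes, where each round re-discretizes and re-selects features on a weighted
# resample of the training data. Dataset and hyperparameters are illustrative.
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import KBinsDiscretizer

def fit_boosted_nb(X, y, n_rounds=10, n_bins=5, k_features=15, rng=None):
    """Fit a boosted 'discretize -> filter -> naive Bayes' ensemble."""
    rng = np.random.default_rng(rng)
    n, classes = len(y), np.unique(y)
    w = np.full(n, 1.0 / n)              # instance weights
    ensemble = []                        # (discretizer, selector, model, alpha)
    for _ in range(n_rounds):
        # Resample by the current weights so the discretizer and the filter
        # also "see" the boosting distribution.
        idx = rng.choice(n, size=n, replace=True, p=w)
        Xs, ys = X[idx], y[idx]
        disc = KBinsDiscretizer(n_bins=n_bins, encode="onehot-dense",
                                strategy="uniform").fit(Xs)
        sel = SelectKBest(chi2, k=k_features).fit(disc.transform(Xs), ys)
        nb = MultinomialNB().fit(sel.transform(disc.transform(Xs)), ys)
        # Weighted training error on the full (unresampled) training set.
        pred = nb.predict(sel.transform(disc.transform(X)))
        miss = pred != y
        err = np.clip(np.dot(w, miss), 1e-10, None)
        if err >= 1.0 - 1.0 / len(classes):   # no better than chance: skip round
            continue
        alpha = np.log((1.0 - err) / err) + np.log(len(classes) - 1)  # SAMME weight
        ensemble.append((disc, sel, nb, alpha))
        w = w * np.exp(alpha * miss)          # upweight misclassified instances
        w = w / w.sum()
    return classes, ensemble

def predict_boosted_nb(classes, ensemble, X):
    """Weighted vote of the per-round naive Bayes models."""
    votes = np.zeros((len(X), len(classes)))
    for disc, sel, nb, alpha in ensemble:
        pred = nb.predict(sel.transform(disc.transform(X)))
        votes += alpha * (pred[:, None] == classes[None, :])
    return classes[votes.argmax(axis=1)]

# Illustrative usage on a standard benchmark dataset.
X, y = load_breast_cancer(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
classes, ensemble = fit_boosted_nb(X_tr, y_tr, rng=0)
print("test accuracy:",
      accuracy_score(y_te, predict_boosted_nb(classes, ensemble, X_te)))
```

Fitting the discretizer and the filter inside the loop, rather than once up front, is what lets each boosting round adapt its feature representation to the reweighted data, which is the distinguishing point of the approach described in the abstract.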
