Improving Naive Bayes for Classification

Abstract: Naive Bayes (NB) is one of the most widely used algorithms for classification. However, its conditional independence assumption harms its performance to some extent, and many algorithms have therefore been proposed to improve its classification accuracy. In this paper, we single out two further improved algorithms: instance weighted naive Bayes (IWNB) and combined neighbourhood naive Bayes (CNNB). In IWNB, each training instance is first weighted according to its similarity to the mode of the training instances, and an NB classifier is then built on the weighted training instances. In CNNB, multiple NB classifiers are first built on multiple neighbourhoods of a test instance, each with a different radius value, and their class probability estimates are then averaged to estimate the class probability of the test instance. We experimentally tested IWNB and CNNB on all 36 University of California, Irvine (UCI) data sets selected by Weka, and compared them with NB. The experimental results show that both IWNB and CNNB significantly outperform NB.
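The two ideas summarised in the abstract can be illustrated with a minimal sketch for nominal attributes. The abstract does not give the similarity measure, the weighting formula, the distance function, or the radius values, so everything below is an assumption made for illustration: similarity is taken as the number of attribute values that agree with the per-attribute mode, the weight is 1 plus that count, neighbourhoods use Hamming distance, and probabilities are Laplace-smoothed. The function names are hypothetical and this is not the authors' implementation.

```python
from collections import Counter

def attribute_modes(X):
    """Most frequent value of each nominal attribute (the 'mode' instance)."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*X)]

def instance_weights(X):
    """Assumed weighting: 1 + number of attribute values agreeing with the mode."""
    mode = attribute_modes(X)
    return [1.0 + sum(v == m for v, m in zip(row, mode)) for row in X]

def nb_posterior(X, y, x, classes, n_values, weights=None):
    """Class probabilities for x from a (weighted) Laplace-smoothed NB."""
    if weights is None:
        weights = [1.0] * len(X)
    total = sum(weights)
    scores = {}
    for c in classes:
        idx = [i for i, label in enumerate(y) if label == c]
        w_c = sum(weights[i] for i in idx)
        # Smoothed class prior from weighted counts.
        p = (w_c + 1.0) / (total + len(classes))
        for j, v in enumerate(x):
            w_match = sum(weights[i] for i in idx if X[i][j] == v)
            # Smoothed conditional P(attribute j = v | class c).
            p *= (w_match + 1.0) / (w_c + n_values[j])
        scores[c] = p
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

def iwnb_posterior(X, y, x):
    """IWNB sketch: weight instances by similarity to the mode, then run NB."""
    classes = sorted(set(y))
    n_values = [len(set(col)) for col in zip(*X)]
    return nb_posterior(X, y, x, classes, n_values, instance_weights(X))

def cnnb_posterior(X, y, x, radii):
    """CNNB sketch: average NB estimates over neighbourhoods of x.

    Assumption: a neighbourhood of radius r holds the training instances
    whose Hamming distance to x is at most r.
    """
    classes = sorted(set(y))
    n_values = [len(set(col)) for col in zip(*X)]
    avg = {c: 0.0 for c in classes}
    for r in radii:
        keep = [i for i, row in enumerate(X)
                if sum(a != b for a, b in zip(row, x)) <= r]
        post = nb_posterior([X[i] for i in keep], [y[i] for i in keep],
                            x, classes, n_values)
        for c in classes:
            avg[c] += post[c] / len(radii)
    return avg

# Toy data, invented purely for illustration.
X = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"),
     ("rainy", "cool"), ("sunny", "hot"), ("rainy", "cool")]
y = ["no", "no", "yes", "yes", "no", "yes"]
x = ("sunny", "hot")

print(iwnb_posterior(X, y, x))
print(cnnb_posterior(X, y, x, radii=(0, 1, 2)))
```

Both functions return a dictionary mapping each class label to an estimated probability. In this sketch an empty neighbourhood falls back to a uniform estimate through the smoothing terms, which is a design choice of the example rather than something stated in the abstract.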
