WBCSVM: Weighted Bayesian Classification based on Support Vector Machines

This paper introduces an algorithm that combines naïve Bayes classification with feature weighting. Most related approaches to feature transformation for naïve Bayes rely on heuristics and non-exhaustive search strategies to select a subset of features on which naïve Bayes performs better than on the complete feature set. In contrast, the algorithm introduced in this paper employs feature weighting performed by a support vector machine. The weights are optimised so as to reduce the danger of overfitting. To the best of our knowledge, this is the first time that naïve Bayes classification has been combined with feature weighting. Experimental results on 15 UCI domains demonstrate that WBCSVM compares favourably with state-of-the-art machine learning approaches.
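The general idea of weighting naïve Bayes features with a support vector machine can be illustrated with a minimal Python sketch. It is not the implementation evaluated in the paper, and it rests on several assumptions made purely for illustration: a binary task with categorical features, Laplace-smoothed per-feature log-likelihood ratios as the quantities being weighted, and a linear soft-margin SVM trained by plain stochastic subgradient descent. All function names, the toy data and the hyperparameters (`alpha`, `lam`, `eta`, `epochs`) are hypothetical.

```python
import math
import random


def fit_log_ratios(X, y, alpha=1.0):
    """Naive Bayes step: per-feature tables of log P(v | +1) - log P(v | -1),
    estimated with Laplace smoothing (alpha is an assumed smoothing constant)."""
    n_pos = sum(1 for label in y if label == 1)
    n_neg = len(y) - n_pos
    tables = []
    for j in range(len(X[0])):
        counts = {}
        for row, label in zip(X, y):
            c = counts.setdefault(row[j], [0, 0])
            c[0 if label == 1 else 1] += 1
        k = len(counts)  # number of distinct values seen for feature j
        tables.append({
            v: math.log((c[0] + alpha) / (n_pos + alpha * k))
            - math.log((c[1] + alpha) / (n_neg + alpha * k))
            for v, c in counts.items()
        })
    return tables


def transform(tables, row):
    """Map an example to its vector of log-ratios; unseen values contribute 0."""
    return [tables[j].get(v, 0.0) for j, v in enumerate(row)]


def train_linear_svm(Z, y, lam=0.01, eta=0.05, epochs=200, seed=0):
    """Weighting step: minimise L2-regularised hinge loss by stochastic
    subgradient descent. The regulariser (margin maximisation) is what
    limits overfitting of the feature weights in this sketch."""
    rng = random.Random(seed)
    w, b = [0.0] * len(Z[0]), 0.0
    order = list(range(len(Z)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            margin = y[i] * (sum(wj * zj for wj, zj in zip(w, Z[i])) + b)
            w = [(1.0 - eta * lam) * wj for wj in w]  # shrink towards zero
            if margin < 1.0:  # example violates the margin: move towards it
                w = [wj + eta * y[i] * zj for wj, zj in zip(w, Z[i])]
                b += eta * y[i]
    return w, b


def predict(tables, w, b, row):
    """Weighted naive Bayes decision: sign of the weighted sum of log-ratios."""
    z = transform(tables, row)
    return 1 if sum(wj * zj for wj, zj in zip(w, z)) + b >= 0 else -1


# Toy data (made up): feature 0 determines the class, feature 1 is weakly
# informative, feature 2 is distributed identically in both classes.
X = [("a", "x", "p"), ("a", "x", "q"), ("a", "y", "p"), ("a", "x", "q"),
     ("b", "y", "p"), ("b", "y", "q"), ("b", "x", "p"), ("b", "y", "q")]
y = [1, 1, 1, 1, -1, -1, -1, -1]

tables = fit_log_ratios(X, y)
Z = [transform(tables, row) for row in X]
w, b = train_linear_svm(Z, y)
print("feature weights:", w, "bias:", b)
print("prediction for ('a', 'y', 'q'):", predict(tables, w, b, ("a", "y", "q")))
```

With all weights fixed at 1 and the bias set to the log prior ratio, `predict` reduces to ordinary naïve Bayes, so the learned weights can be read as a soft alternative to selecting a feature subset. In the toy data the third feature has a log-ratio of zero for every value, so its weight is never updated.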
