Feature Selection for Ensembles of Simple Bayesian Classifiers

A popular way to build an accurate classifier from a set of training data is to train several classifiers and then combine their predictions. Ensembles of simple Bayesian classifiers have traditionally received little research attention, yet the simple Bayesian classifier has much broader applicability than previously thought: besides high classification accuracy, it offers simplicity, fast learning and classification, modest storage requirements, and incrementality. One way to generate an ensemble of simple Bayesian classifiers is to train each member on a different feature subset, as in the random subspace method. In this paper we present a technique for building ensembles of simple Bayesian classifiers in random subspaces, together with a hill-climbing refinement cycle that improves both the accuracy and the diversity of the base classifiers. In experiments on a collection of real-world and synthetic data sets, the ensembles are in many cases significantly more accurate than a single "global" simple Bayesian classifier. We also compare several methods for integrating the base classifiers and find that dynamic integration exploits ensemble diversity better than static integration.
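To make the construction concrete, the following is a minimal sketch, not the authors' implementation: a random-subspace ensemble of naive (simple) Bayesian classifiers, a greedy hill-climbing refinement of each member's feature subset, and majority-vote (static) integration. It uses scikit-learn's GaussianNB and a stand-in data set; the ensemble size, subspace fraction, and number of refinement steps are illustrative assumptions, and the refinement here climbs on validation accuracy alone, whereas the paper's cycle also rewards diversity.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB

rng = np.random.default_rng(0)

# Illustrative settings; the paper does not prescribe these values.
N_MEMBERS = 10           # ensemble size
SUBSPACE_FRACTION = 0.5  # fraction of features drawn per member
REFINE_STEPS = 20        # hill-climbing iterations per member

# Stand-in data set; any tabular classification data would do.
X, y = load_breast_cancer(return_X_y=True)
X_tmp, X_test, y_tmp, y_test = train_test_split(X, y, test_size=0.25,
                                                random_state=0)
X_tr, X_val, y_tr, y_val = train_test_split(X_tmp, y_tmp, test_size=0.25,
                                            random_state=0)
n_features = X.shape[1]
k = max(1, int(SUBSPACE_FRACTION * n_features))


def fit_and_score(features):
    """Train naive Bayes on a feature subset; return (val accuracy, model)."""
    clf = GaussianNB().fit(X_tr[:, features], y_tr)
    return clf.score(X_val[:, features], y_val), clf


def hill_climb(features):
    """Greedy refinement: swap one feature at a time, keep improvements."""
    best_acc, best_clf = fit_and_score(features)
    for _ in range(REFINE_STEPS):
        candidate = features.copy()
        out = rng.integers(len(candidate))                  # feature to drop
        pool = np.setdiff1d(np.arange(n_features), candidate)
        candidate[out] = rng.choice(pool)                   # feature to add
        acc, clf = fit_and_score(candidate)
        if acc > best_acc:
            features, best_acc, best_clf = candidate, acc, clf
    return features, best_clf


# Each ensemble member starts from a random subspace and is then refined.
ensemble = [hill_climb(rng.choice(n_features, size=k, replace=False))
            for _ in range(N_MEMBERS)]

# Static integration: plain majority voting over member predictions.
votes = np.array([clf.predict(X_test[:, f]) for f, clf in ensemble])
majority = np.apply_along_axis(lambda c: np.bincount(c).argmax(), 0, votes)

print("ensemble accuracy: %.3f" % np.mean(majority == y_test))
print("single NB accuracy: %.3f"
      % GaussianNB().fit(X_tr, y_tr).score(X_test, y_test))
```

Dynamic integration, which the abstract reports as the stronger scheme, would instead estimate each member's local accuracy in the neighborhood of a test instance and select or weight members per instance rather than voting uniformly.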
