Soft partitions lead to better learned ensembles

Ensembles of classifiers often achieve better classification accuracy than a single classifier. One approach to building ensembles is to train each member on a different subset of the training data. We present a method that creates such subsets by partitioning the dataset into regions via clustering: a learner is trained on each region, and the ensemble classifies a query point by consulting the learner responsible for the region containing it. The first partitioning strategy we consider generates a hard, non-overlapping partition; this is shown to perform worse than a single classifier trained on the entire training set. Using soft partitions, however, significantly improves overall ensemble performance. We consider three methods of creating soft partitions: a simple distance ratio, the fuzzy c-means membership function, and the possibilistic c-means membership function. All three improve classifier performance beyond hard partitioning and often outperform the base classifier trained on the entire training set. Experiments on six datasets illustrate the improved accuracy obtained by building ensembles on soft partitions of the data.
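
A minimal sketch of the soft-partition idea, assuming k-means to find region centers, the fuzzy c-means membership function with fuzzifier m = 2, a membership threshold of 0.2 for region assignment, and scikit-learn decision trees as the base learner; these specifics are illustrative assumptions, not the paper's exact procedure. Prediction routes each query to the learner of its highest-membership region, which is one plausible reading of "querying the learned classifier".

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.tree import DecisionTreeClassifier  # stand-in base learner (assumption)

def fcm_memberships(X, centers, m=2.0):
    """Fuzzy c-means membership: u[i, j] = 1 / sum_k (d_ij / d_ik)^(2/(m-1))."""
    d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)  # (n, c) distances
    d = np.maximum(d, 1e-12)  # avoid division by zero for points on a center
    ratio = (d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0))     # (n, c, c)
    return 1.0 / ratio.sum(axis=2)                                   # rows sum to 1

def train_soft_partition_ensemble(X, y, n_regions=5, threshold=0.2):
    # Cluster the training data to define regions (k-means is an assumption here).
    centers = KMeans(n_clusters=n_regions, n_init=10).fit(X).cluster_centers_
    u = fcm_memberships(X, centers)
    learners = []
    for j in range(n_regions):
        idx = u[:, j] >= threshold  # soft, overlapping subset for region j
        learners.append(DecisionTreeClassifier().fit(X[idx], y[idx]))
    return centers, learners

def predict(X, centers, learners):
    # Answer each query with the learner of its highest-membership region
    # (one plausible combination rule; the paper may combine differently).
    region = fcm_memberships(X, centers).argmax(axis=1)
    return np.array([learners[r].predict(x[None, :])[0]
                     for r, x in zip(region, X)])
```

The threshold controls the degree of overlap between regions: raising it toward the maximum membership recovers something close to a hard partition, while lowering it lets each learner see more of its neighbors' data, which is the effect the soft-partitioning results attribute the accuracy gains to.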
