New Applications of Ensembles of Classifiers

Combining classifiers into ensembles is now a well-established research line. It has been observed that the predictive accuracy of a combination of independent classifiers can exceed that of the single best classifier. While ensembles of classifiers have mostly been employed to achieve higher recognition accuracy, this paper focuses on the use of combinations of individual classifiers to handle several practical problems in the machine learning, pattern recognition and data mining domains. In particular, the study concentrates on managing the imbalanced training sample problem, scaling up certain preprocessing algorithms and filtering the training set. All of these situations are examined mainly in connection with the nearest neighbour classifier. Experimental results show the potential of multiple classifier systems when applied to these situations.
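The combination scheme the abstract refers to can be illustrated with a minimal sketch (not the authors' implementation; the partitioning strategy, member count and toy data below are illustrative assumptions): the training set is split into disjoint blocks, each block trains one nearest-neighbour member, and the ensemble predicts by majority vote. This is one way an ensemble can also scale up instance-based learning, since each member only searches a fraction of the data.

```python
from collections import Counter

def nn_predict(prototypes, x):
    # 1-NN rule: return the label of the closest stored prototype
    # (squared Euclidean distance; prototypes are (features, label) pairs).
    best = min(prototypes, key=lambda p: sum((a - b) ** 2 for a, b in zip(p[0], x)))
    return best[1]

def partitioned_nn_ensemble(train, n_members=3):
    # Split the training set round-robin into disjoint blocks; each block
    # becomes the prototype set of one 1-NN member.
    members = [train[i::n_members] for i in range(n_members)]
    def predict(x):
        # Majority vote over the individual members' decisions.
        votes = Counter(nn_predict(m, x) for m in members)
        return votes.most_common(1)[0][0]
    return predict

# Toy two-class sample: class 0 clustered near (0, 0), class 1 near (5, 5).
train = [((0.0, 0.1), 0), ((5.0, 5.1), 1), ((0.2, 0.0), 0),
         ((4.8, 5.0), 1), ((0.1, 0.3), 0), ((5.2, 4.9), 1)]
predict = partitioned_nn_ensemble(train)
print(predict((0.1, 0.2)))  # -> 0
print(predict((5.0, 5.0)))  # -> 1
```

In practice the members could equally be built from bootstrap samples (bagging) or condensed/edited subsets of the training data; the voting step is the same.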
