Ensemble of a subset of kNN classifiers

Combining multiple classifiers, known as ensemble learning, can substantially improve the predictive performance of learning algorithms, especially in the presence of non-informative features in the data. We propose an ensemble of a subset of kNN classifiers, ESkNN, for classification tasks, constructed in two steps. First, we select classifiers based on their individual out-of-sample accuracy. The selected classifiers are then combined sequentially, starting from the best model, and assessed for collective performance on a validation data set. We evaluate the method on benchmark data sets, both in their original form and with added non-informative features, and compare the results with those of the usual kNN, bagged kNN, random kNN, the multiple feature subsets method, random forest and support vector machines. Our experiments on benchmark classification problems and simulated data sets show that the proposed ensemble gives better classification performance than the usual kNN and its ensembles, and performs comparably to random forest and support vector machines.
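To make the two-step procedure concrete, the sketch below (in Python with scikit-learn, which is not what the paper itself uses) builds kNN models on random feature subsets, ranks them by out-of-bag accuracy, and then grows the ensemble greedily from the best model, keeping an addition only if the validation accuracy of the majority vote does not drop. The subset size, number of base models, bootstrap-based out-of-sample estimate and acceptance rule are illustrative assumptions, not the authors' exact settings.

    # Minimal sketch of the two-step ESkNN idea described in the abstract.
    # Assumptions (not from the paper): sqrt(p) features per model, 50 base
    # models, bootstrap out-of-bag accuracy for step 1, majority vote with
    # integer-coded class labels for step 2.
    import numpy as np
    from sklearn.neighbors import KNeighborsClassifier
    from sklearn.model_selection import train_test_split
    from sklearn.metrics import accuracy_score

    rng = np.random.default_rng(0)

    def build_base_models(X_train, y_train, n_models=50, subset_size=None, k=5):
        """Step 1: fit kNN models on random feature subsets and score each
        on the observations left out of its bootstrap sample."""
        n, p = X_train.shape
        subset_size = subset_size or max(1, int(np.sqrt(p)))  # assumed default
        models = []
        for _ in range(n_models):
            feats = rng.choice(p, size=subset_size, replace=False)
            boot = rng.choice(n, size=n, replace=True)
            oob = np.setdiff1d(np.arange(n), boot)
            clf = KNeighborsClassifier(n_neighbors=k)
            clf.fit(X_train[np.ix_(boot, feats)], y_train[boot])
            score = accuracy_score(y_train[oob],
                                   clf.predict(X_train[np.ix_(oob, feats)]))
            models.append((score, feats, clf))
        # best individual models first
        return sorted(models, key=lambda m: m[0], reverse=True)

    def majority_vote(members, X):
        """Combine member predictions by simple majority vote
        (class labels assumed to be integer-coded)."""
        preds = np.array([clf.predict(X[:, feats]) for _, feats, clf in members])
        return np.array([np.bincount(col.astype(int)).argmax() for col in preds.T])

    def select_ensemble(models, X_val, y_val):
        """Step 2: add models sequentially, best first, keeping those that
        do not reduce the validation accuracy of the combined vote."""
        chosen = [models[0]]
        best_acc = accuracy_score(y_val, majority_vote(chosen, X_val))
        for cand in models[1:]:
            trial = chosen + [cand]
            acc = accuracy_score(y_val, majority_vote(trial, X_val))
            if acc >= best_acc:
                chosen, best_acc = trial, acc
        return chosen

    # Usage sketch:
    # X_tr, X_val, y_tr, y_val = train_test_split(X, y, test_size=0.3, random_state=0)
    # ensemble = select_ensemble(build_base_models(X_tr, y_tr), X_val, y_val)
    # y_hat = majority_vote(ensemble, X_test)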
