Creating diverse nearest-neighbour ensembles using simultaneous metaheuristic feature selection

The nearest-neighbour (1NN) classifier has long been used in pattern recognition, exploratory data analysis, and data mining. A vital consideration in obtaining good results with this technique is the choice of distance function and, correspondingly, of which features to consider when computing distances between samples. In recent years there has been increasing interest in creating ensembles of classifiers to improve classification accuracy. This paper proposes a new ensemble technique that combines multiple 1NN classifiers, each using a different distance function and, potentially, a different set of features (feature vector). These feature vectors are determined for all distance metrics simultaneously using Tabu Search so as to minimise the ensemble error rate. We show that this approach implicitly selects for a diverse set of classifiers and thereby achieves greater performance improvements than can be obtained by treating the classifiers independently or by using a single feature set. Naturally, optimising at the level of the ensemble entails a much larger solution space; to make the approach tractable, we show how Tabu Search at the ensemble level can be hybridised with local search at the level of individual classifiers. The proposed ensemble classifier with different distance metrics and different feature vectors is evaluated on benchmark datasets from the UCI Machine Learning Repository and on a real-world machine-vision application. Results indicate a significant increase in performance compared with various well-known classifiers. Furthermore, the proposed ensemble method is also compared with an ensemble classifier that uses different distance metrics but a single, shared feature vector (with or without feature selection (FS)).
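
To make the idea concrete, below is a minimal sketch, not the authors' implementation, of the core construction described in the abstract: an ensemble of 1NN classifiers, each paired with a different distance metric and its own binary feature mask, where a simple bit-flip Tabu Search selects all masks simultaneously by minimising the majority-vote error on a held-out validation set. The three metrics, the function names, the tabu-list length, and the omission of an aspiration criterion and of the paper's classifier-level local-search hybridisation are all illustrative assumptions.

```python
import numpy as np

# Distance functions; each classifier in the ensemble uses a different one.
def euclidean(a, b):
    return np.sqrt(np.sum((a - b) ** 2))

def manhattan(a, b):
    return np.sum(np.abs(a - b))

def chebyshev(a, b):
    return np.max(np.abs(a - b))

DISTANCES = [euclidean, manhattan, chebyshev]


def predict_1nn(X_train, y_train, x, mask, dist):
    """1NN prediction using only the features selected by the boolean mask."""
    if not mask.any():                       # guard against an empty feature set
        return y_train[0]
    d = [dist(xt[mask], x[mask]) for xt in X_train]
    return y_train[int(np.argmin(d))]


def ensemble_error(X_train, y_train, X_val, y_val, masks):
    """Error rate of the majority-vote ensemble on a validation set."""
    errors = 0
    for x, y in zip(X_val, y_val):
        votes = [predict_1nn(X_train, y_train, x, m, d)
                 for m, d in zip(masks, DISTANCES)]
        majority = max(set(votes), key=votes.count)
        errors += int(majority != y)
    return errors / len(y_val)


def tabu_search(X_train, y_train, X_val, y_val, n_iter=50, tabu_len=10, seed=0):
    """Select one feature mask per distance metric simultaneously by
    minimising the ensemble (not the individual-classifier) error rate."""
    X_train, X_val = np.asarray(X_train), np.asarray(X_val)
    rng = np.random.default_rng(seed)
    n_feat = X_train.shape[1]
    masks = [rng.random(n_feat) < 0.5 for _ in DISTANCES]
    best = [m.copy() for m in masks]
    best_err = ensemble_error(X_train, y_train, X_val, y_val, best)
    tabu = []                                 # recently flipped (classifier, feature) moves
    for _ in range(n_iter):
        # Neighbourhood: flip one feature bit in one classifier's mask.
        candidates = []
        for c in range(len(masks)):
            for f in range(n_feat):
                if (c, f) in tabu:
                    continue
                trial = [m.copy() for m in masks]
                trial[c][f] = not trial[c][f]
                err = ensemble_error(X_train, y_train, X_val, y_val, trial)
                candidates.append((err, c, f, trial))
        if not candidates:
            break
        err, c, f, trial = min(candidates, key=lambda t: t[0])
        masks = trial                         # accept best non-tabu move, even if worse
        tabu.append((c, f))
        if len(tabu) > tabu_len:
            tabu.pop(0)
        if err < best_err:
            best_err, best = err, [m.copy() for m in masks]
    return best, best_err


if __name__ == "__main__":
    # Tiny synthetic example: only two of eight features are informative.
    rng = np.random.default_rng(1)
    X = rng.normal(size=(60, 8))
    y = (X[:, 0] + X[:, 3] > 0).astype(int)
    masks, err = tabu_search(X[:40], y[:40], X[40:], y[40:], n_iter=20)
    print("validation error:", err)
```

Because the objective is the error of the combined vote rather than of each member, masks that make the members disagree on different regions of the input space are rewarded, which is the source of the implicit diversity noted above; the full method additionally interleaves local search within individual classifiers to keep the larger search space tractable.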
