Locally adaptive k parameter selection for nearest neighbor classifier: one nearest cluster

The k-nearest neighbors (k-NN) classification technique is widely known for its simplicity, effectiveness, and robustness. As a lazy learner, k-NN is a versatile algorithm used in many fields. In this classifier, the k parameter is generally chosen by the user, and the optimal value is found by experiment; the chosen constant k is then used throughout the entire classification phase. Using the same k value for every test sample can, however, degrade overall prediction performance, since the optimal k may vary from one test sample to another. In this study, a method that selects a k value dynamically for each instance is proposed. This improved classification method employs a simple clustering procedure. The experiments yield more accurate results, and the reasons for this success are analyzed and presented.
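The abstract only outlines the approach, so the following is a minimal sketch of the general idea rather than the paper's exact algorithm: the training data is clustered, and each test sample's k is derived from the cluster nearest to it. The choice of k-means as the clustering step, the rule that sets k to the nearest cluster's size, and the function names (`adaptive_knn_predict`, `n_clusters`) are all illustrative assumptions, not details confirmed by the paper.

```python
# A minimal sketch of per-instance k selection via a simple clustering step.
# Assumptions (not taken from the paper): k-means forms the clusters, and each
# test sample's k is set to the size of its nearest cluster.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def adaptive_knn_predict(X_train, y_train, X_test, n_clusters=5, random_state=0):
    """Predict labels with a per-test-sample k chosen from cluster sizes."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    cluster_ids = km.fit_predict(X_train)
    cluster_sizes = np.bincount(cluster_ids, minlength=n_clusters)

    preds = []
    for x in X_test:
        # Nearest cluster centroid decides this sample's k (hypothetical rule).
        nearest = np.argmin(np.linalg.norm(km.cluster_centers_ - x, axis=1))
        k = int(np.clip(cluster_sizes[nearest], 1, len(X_train)))
        knn = KNeighborsClassifier(n_neighbors=k).fit(X_train, y_train)
        preds.append(knn.predict(x.reshape(1, -1))[0])
    return np.array(preds)


if __name__ == "__main__":
    X, y = load_iris(return_X_y=True)
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)
    y_pred = adaptive_knn_predict(X_tr, y_tr, X_te)
    print("accuracy:", (y_pred == y_te).mean())
```

The sketch refits the k-NN classifier once per test sample for clarity; a practical version would cache one classifier per distinct k value that the cluster sizes produce.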
