A Modification on K-Nearest Neighbor Classifier

K-Nearest Neighbor (KNN) classification is one of the most fundamental and simple classification methods. When there is little or no prior knowledge about the distribution of the data, the KNN method should be one of the first choices for classification. In this paper, a modification is proposed to improve the performance of KNN. The main idea is to rely on robust neighbors in the training data. The modified method outperforms the traditional KNN in terms of both robustness and accuracy. The proposed classifier is called Modified K-Nearest Neighbor (MKNN). Inspired by the traditional KNN algorithm, the main idea is to classify an input query according to the most frequent tag in the set of its neighbors' tags. MKNN can be considered a kind of weighted KNN, in which the query label is approximated by weighting the neighbors of the query. The procedure computes, for each training point, the fraction of its neighbors that carry the same label out of its total number of neighbors. The proposed method is evaluated on several standard UCI data sets. Experiments show a significant improvement over the performance of the plain KNN method.
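
As a rough illustration of this weighting idea, the following Python sketch assigns each training point a validity score (the fraction of its own nearest neighbors that share its label) and classifies a query by a validity- and distance-weighted vote. This is a minimal sketch under stated assumptions, not the paper's exact algorithm: the validity definition, the weight validity / (distance + alpha), and all function names here are illustrative choices.

import numpy as np
from collections import defaultdict

def knn_indices(X, x, k, exclude=None):
    # Indices of the k nearest rows of X to point x (Euclidean), plus all distances.
    d = np.linalg.norm(X - x, axis=1)
    if exclude is not None:
        d[exclude] = np.inf  # a point must not count as its own neighbor
    return np.argsort(d)[:k], d

def validities(X, y, k):
    # Validity of each training point: the fraction of its k nearest
    # neighbors (itself excluded) that carry the same label.
    v = np.empty(len(X))
    for i in range(len(X)):
        idx, _ = knn_indices(X, X[i], k, exclude=i)
        v[i] = np.mean(y[idx] == y[i])
    return v

def mknn_predict(X, y, v, query, k, alpha=0.5):
    # Label the query by a validity- and distance-weighted vote
    # (the weight formula is an assumption for illustration).
    idx, d = knn_indices(X, query, k)
    votes = defaultdict(float)
    for i in idx:
        votes[y[i]] += v[i] / (d[i] + alpha)
    return max(votes, key=votes.get)

# Tiny usage example on two synthetic 2-D Gaussian clusters.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (20, 2)), rng.normal(4, 1, (20, 2))])
y = np.array([0] * 20 + [1] * 20)
v = validities(X, y, k=5)
print(mknn_predict(X, y, v, np.array([3.5, 3.5]), k=5))  # -> 1

Under this reading, points deep inside their own class get validity near 1 and dominate the vote, while boundary points and mislabeled outliers get low validity and contribute little, which is one plausible interpretation of "using robust neighbors."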
