A New Method for Improving the Performance of K Nearest Neighbor using Clustering Technique

In this paper, a new classification method is presented that uses clustering techniques to improve the performance of the K-Nearest Neighbor (KNN) algorithm. The new method is called the Nearest Cluster (NC) approach. In this algorithm, the neighborhoods are determined automatically by clustering: the training set is partitioned, and each cluster center is assigned a class label. The class label of a new test sample is then taken from the nearest cluster prototype. Computationally, the NC method is roughly K times faster than KNN. In addition, the clustering step determines a suitable number of neighbors based on the structure of the feature space. The proposed method is evaluated on two standard data sets, SAHeart and Monk. Experimental results show clear improvements in both accuracy and time complexity compared with the KNN method.

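The following is a minimal sketch of the described approach, assuming k-means clustering, Euclidean distance, and majority voting to label cluster centers; function names and the choice of clustering algorithm are illustrative, not taken from the paper.

```python
import numpy as np
from sklearn.cluster import KMeans


def fit_nearest_cluster(X_train, y_train, n_clusters=10):
    """Partition the training set and label each cluster center by majority vote."""
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X_train)
    center_labels = np.empty(n_clusters, dtype=y_train.dtype)
    for c in range(n_clusters):
        members = y_train[km.labels_ == c]
        if members.size == 0:
            # Rare with k-means: fall back to the overall class distribution
            members = y_train
        # Majority class among the samples assigned to this cluster
        values, counts = np.unique(members, return_counts=True)
        center_labels[c] = values[np.argmax(counts)]
    return km.cluster_centers_, center_labels


def predict_nearest_cluster(X_test, centers, center_labels):
    """Assign each test sample the label of its nearest cluster prototype."""
    dists = np.linalg.norm(X_test[:, None, :] - centers[None, :, :], axis=2)
    return center_labels[np.argmin(dists, axis=1)]
```

Under these assumptions, classifying a test sample requires one distance computation per cluster prototype instead of one per training sample, which is the source of the speedup over KNN described in the abstract.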