Improving Performance of the k-Nearest Neighbor Classifier by Combining Feature Selection with Feature Weighting

The k-nearest neighbor (k-NN) classifier is a simple and effective classification approach. However, it suffers from over-sensitivity to irrelevant and noisy features. There are two ways to reduce this sensitivity: one is to assign each feature a weight, and the other is to select a subset of relevant features. Existing research has shown that both approaches can improve generalization accuracy, but it is difficult to predict which one is better for a specific dataset. In this paper, we propose an algorithm that improves the effectiveness of k-NN by combining the two approaches: we first select all relevant features and then assign a weight to each of them. Experiments were conducted on 14 datasets from the UCI Machine Learning Repository, and the results show that our algorithm achieves the highest, or close to the highest, accuracy on all test datasets, increasing generalization accuracy by 8.68% on average. It also achieves higher generalization accuracy than the well-known machine learning algorithms IB1-4 and C4.5.
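The sketch below illustrates the general select-then-weight pipeline the abstract describes, not the paper's exact procedure. It assumes mutual information (estimated by histogram discretization) as the relevance score, a fixed selection threshold, the scores themselves as feature weights, and a weighted Euclidean k-NN vote; the threshold, k, and the relevance estimator are all assumptions for illustration.

```python
# Minimal sketch of a select-then-weight k-NN pipeline (assumptions noted above).
import numpy as np
from collections import Counter

def mi_scores(X, y, bins=10):
    """Estimate I(feature; class) per feature via histogram discretization."""
    n, scores = len(y), np.empty(X.shape[1])
    py = Counter(y)
    for j in range(X.shape[1]):
        # Discretize the j-th feature into `bins` intervals.
        edges = np.histogram_bin_edges(X[:, j], bins=bins)[1:-1]
        xj = np.digitize(X[:, j], edges)
        px, joint = Counter(xj), Counter(zip(xj, y))
        scores[j] = sum((c / n) * np.log((c / n) / ((px[a] / n) * (py[b] / n)))
                        for (a, b), c in joint.items())
    return scores

def fit_select_and_weight(X, y, threshold=0.01):
    """Step 1: keep features whose relevance exceeds a threshold.
       Step 2: weight each surviving feature by its relevance score."""
    scores = mi_scores(X, y)
    selected = np.flatnonzero(scores > threshold)
    return selected, scores[selected]

def knn_predict(X_train, y_train, X_test, selected, weights, k=5):
    """Majority vote among the k nearest neighbors under a weighted Euclidean metric."""
    # Scaling each feature by sqrt(weight) realizes the weighted Euclidean distance.
    Xtr = X_train[:, selected] * np.sqrt(weights)
    Xte = X_test[:, selected] * np.sqrt(weights)
    preds = []
    for x in Xte:
        d = np.linalg.norm(Xtr - x, axis=1)
        nearest = y_train[np.argsort(d)[:k]]
        preds.append(Counter(nearest).most_common(1)[0][0])
    return np.array(preds)
```

The key design point mirrored here is that selection and weighting are applied in sequence rather than as alternatives: irrelevant features are removed outright, and the remaining features influence the distance in proportion to their estimated relevance.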
