Memory-Based Classification with Dynamic Feature Selection Using Self-Organizing Maps for Pattern Evaluation

Memory-based learning is one of the principal fields of machine learning. We propose a new classification methodology built on the central idea of the k-nearest neighbors algorithm, the best-known representative of this field. In the proposed approach, given an unclassified pattern, a set of neighboring patterns is retrieved, not necessarily using all input feature dimensions. Following the concept of the naive Bayesian classifier, we additionally adopt the assumption that the input features contribute independently to the classification outcome. The two concepts are merged to combine their respective strengths. To further improve performance, we introduce a novel weighting scheme for the memory base: a self-organizing map, maintained during the execution of the algorithm, produces dynamic weights for the stored patterns. Experimental results show that the proposed method outperforms the aforementioned algorithms and their variations.
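
The abstract describes the method only at a high level, so the following Python sketch is one plausible realization of the combined idea, not the authors' exact algorithm. The assumptions are: neighbors are searched independently along each single feature dimension (one reading of "not necessarily using all input feature dimensions"); the per-feature class votes are multiplied under the naive-Bayes independence assumption, with Laplace smoothing to avoid zero probabilities; and the SOM-derived pattern weights are approximated by how strongly each stored pattern's label agrees with the labels of the patterns mapped to the same SOM unit. All function names, the grid size, and the smoothing constant are illustrative choices.

import numpy as np

def train_som(X, grid=(5, 5), iters=2000, lr0=0.5, sigma0=1.5, seed=0):
    """Train a small rectangular SOM on the memory base (plain NumPy)."""
    rng = np.random.default_rng(seed)
    rows, cols = grid
    units = np.array([(r, c) for r in range(rows) for c in range(cols)], float)
    W = rng.normal(size=(rows * cols, X.shape[1]))
    for t in range(iters):
        x = X[rng.integers(len(X))]
        bmu = np.argmin(((W - x) ** 2).sum(axis=1))        # best matching unit
        frac = t / iters
        lr = lr0 * (1.0 - frac)                            # decaying learning rate
        sigma = sigma0 * (1.0 - frac) + 0.1                # decaying radius
        # Gaussian neighborhood on the grid, centered at the BMU
        h = np.exp(-((units - units[bmu]) ** 2).sum(axis=1) / (2 * sigma ** 2))
        W += lr * h[:, None] * (x - W)
    return W

def som_weights(X, y, W):
    """Assumed weighting scheme (not the paper's exact one): each stored
    pattern is weighted by the fraction of patterns mapped to the same
    SOM unit that share its class label."""
    bmus = np.argmin(((X[:, None, :] - W[None, :, :]) ** 2).sum(-1), axis=1)
    weights = np.empty(len(X))
    for u in np.unique(bmus):
        idx = np.where(bmus == u)[0]
        counts = np.bincount(y[idx])
        weights[idx] = counts[y[idx]] / len(idx)           # label agreement
    return weights

def classify(x, X, y, weights, k=5, n_classes=2):
    """Per-feature k-NN votes combined under an independence assumption."""
    log_post = np.zeros(n_classes)
    for j in range(X.shape[1]):            # each feature treated independently
        dist = np.abs(X[:, j] - x[j])      # 1-D distance along feature j
        nn = np.argsort(dist)[:k]          # k nearest neighbors on this feature
        votes = np.bincount(y[nn], weights=weights[nn], minlength=n_classes)
        log_post += np.log(votes + 1.0)    # Laplace smoothing avoids log(0)
    return int(np.argmax(log_post))

# Toy usage: two well-separated Gaussian classes in four dimensions.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0.0, 1.0, (50, 4)), rng.normal(3.0, 1.0, (50, 4))])
y = np.array([0] * 50 + [1] * 50)
W = train_som(X)
w = som_weights(X, y, W)
print(classify(np.full(4, 3.0), X, y, w))  # expected output: 1

Note that the per-feature neighbor search makes the votes sensitive to each dimension in isolation, which is what allows the independence assumption to be applied neighbor-wise rather than through explicit density estimates.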
