A Study of the Robustness of KNN Classifiers Trained Using Soft Labels

Supervised learning models are most commonly trained with crisp labels, which fail to capture the characteristics of the data when overlapping classes exist. In this work we compare learning with soft labels against learning with hard labels for training K-nearest neighbor classifiers. We propose a new technique for generating soft labels based on fuzzy clustering of the data and fuzzy relabeling of the cluster prototypes. Experiments were conducted on five data sets to compare classifiers trained with different types of soft labels against classifiers trained with crisp labels. The results reveal that learning with soft labels is more robust against label errors than learning with crisp labels. The proposed technique for deriving soft labels from the data was also found to yield more robust training on most of the data sets investigated.
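The core idea of classifying with soft rather than crisp labels can be sketched as a fuzzy KNN rule: a query point's class memberships are the distance-weighted average of its neighbors' soft labels. The sketch below is illustrative only (function and variable names are our own, and the toy data is hypothetical); the paper's actual method additionally derives the soft labels via fuzzy clustering and prototype relabeling.

```python
import numpy as np

def soft_knn_predict(X_train, soft_labels, x, k=3, eps=1e-9):
    """Estimate class memberships for x by averaging the soft labels
    of its k nearest neighbours, weighted by inverse distance
    (a minimal sketch of the fuzzy-KNN idea)."""
    d = np.linalg.norm(X_train - x, axis=1)   # Euclidean distances
    idx = np.argsort(d)[:k]                   # k nearest neighbours
    w = 1.0 / (d[idx] + eps)                  # inverse-distance weights
    return (w[:, None] * soft_labels[idx]).sum(axis=0) / w.sum()

# Toy two-class data with overlap; soft labels encode label uncertainty,
# e.g. the point at (0.5, 0.5) belongs equally to both classes.
X = np.array([[0.0, 0.0], [0.2, 0.1], [1.0, 1.0], [0.9, 1.1], [0.5, 0.5]])
y_soft = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [0.5, 0.5]])

m = soft_knn_predict(X, y_soft, np.array([0.1, 0.1]), k=3)
print(m.argmax())  # crisp decision: class with highest membership → 0
```

Because the soft labels of each training point sum to one, the predicted membership vector also sums to one, so a crisp decision is simply the argmax, while the full vector preserves the uncertainty that crisp labels discard.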
