Local distance-based classification

In this paper we introduce a new method in which every training point learns the structure of its neighborhood: a hyperplane is learned and associated with each point. From these hyperplanes we define the bands distance, a distance measure that brings points closer together or pushes them farther apart depending on their classes. We have applied this new distance to classification tasks over 68 datasets: 18 well-known UCI Repository datasets, one private dataset, and 49 ad hoc synthetic datasets. We used 10-fold cross-validation and, to compare the classifiers, we considered the mean accuracy and also performed a paired two-tailed Student's t-test at the 95% significance level. The results are encouraging and confirm the good behavior of the proposed classification method: the bands distance obtains the best overall results with the 1-NN and k-NN classifiers when compared with other distances. Finally, we draw conclusions and outline some lines of future work.
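The abstract does not give the exact definition of the bands distance, so the following is only a minimal sketch of one plausible reading: each training point fits a simple local hyperplane on its k nearest neighbours (here, the normal between same-class and other-class centroids, an assumed choice), and the Euclidean distance to a query is shrunk when the query falls on the point's own side of that hyperplane and stretched otherwise. The function names and the `alpha` scaling knob are illustrative, not from the paper.

```python
import numpy as np

def local_hyperplanes(X, y, k=5):
    """For each training point, fit a simple separating hyperplane on its
    k nearest neighbours: the normal is the same-class centroid minus the
    other-class centroid, anchored at the midpoint between the two.
    (An assumed stand-in for the paper's learned hyperplanes.)"""
    planes = []
    for i, xi in enumerate(X):
        d = np.linalg.norm(X - xi, axis=1)
        nn = np.argsort(d)[1:k + 1]            # skip the point itself
        same = X[nn][y[nn] == y[i]]
        other = X[nn][y[nn] != y[i]]
        if len(same) == 0 or len(other) == 0:  # pure neighbourhood: no band
            planes.append(None)
            continue
        w = same.mean(axis=0) - other.mean(axis=0)
        m = (same.mean(axis=0) + other.mean(axis=0)) / 2.0
        planes.append((w, m))
    return planes

def bands_distance(q, xi, plane, alpha=0.5):
    """Euclidean distance shrunk (query on xi's own side of the local
    hyperplane) or stretched (opposite side). alpha < 1 is an assumed knob."""
    d = np.linalg.norm(q - xi)
    if plane is None:
        return d
    w, m = plane
    return d * alpha if np.dot(w, q - m) > 0 else d / alpha

def predict_1nn(q, X, y, planes):
    """1-NN classification under the sketched bands distance."""
    dists = [bands_distance(q, xi, p) for xi, p in zip(X, planes)]
    return y[int(np.argmin(dists))]
```

On two well-separated 2-D clusters, the classifier behaves like plain 1-NN but with same-class neighbours pulled closer, which is the qualitative effect the abstract describes; the per-point hyperplane is what makes the metric local rather than global.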
