Non-Euclidean or Non-metric Measures Can Be Informative

Statistical learning algorithms often rely on the Euclidean distance. In practice, however, non-Euclidean or non-metric dissimilarity measures arise when contours, spectra, or shapes are compared by edit distances, or as a consequence of robust object matching [1,2]. It is an open question whether such measures are advantageous for statistical learning or whether they should be constrained to obey the metric axioms. The k-nearest-neighbor (NN) rule, as the most natural approach, is widely applied directly to general dissimilarity data. Alternative methods embed such data into suitable representation spaces in which statistical classifiers are then constructed [3]. In this paper, we investigate the relation between the non-Euclidean aspects of dissimilarity data and the classification performance of the direct NN rule and of classifiers trained in representation spaces. We evaluate this on a parameterized family of edit distances whose parameter values control the strength of the non-Euclidean behavior. Our finding is that the discriminative power of such a measure increases with its non-Euclidean and non-metric character until an optimum is reached. We conclude that statistical classifiers perform well on such data and that the optimal parameter values characterize a non-Euclidean and somewhat non-metric measure.
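
The two classification strategies contrasted above can be sketched in a few lines. The toy example below is not the paper's code: the string data, the substitution-cost knob `sub_cost` (standing in for the paper's edit-distance parameterization), and the nearest-mean classifier (standing in for the classifiers trained in representation spaces) are all illustrative assumptions. It applies the direct 1-NN rule to a weighted edit-distance matrix and, alternatively, treats each row of that matrix as a feature vector in a dissimilarity space [3].

```python
# Minimal sketch: direct 1-NN on a dissimilarity matrix vs. a classifier
# built in the dissimilarity space spanned by distances to the prototypes.
import numpy as np

def edit_distance(a: str, b: str, sub_cost: float = 1.0) -> float:
    """Weighted Levenshtein distance: insert/delete cost 1, substitution cost
    sub_cost. Varying sub_cost changes how far the measure is from Euclidean."""
    m, n = len(a), len(b)
    d = np.zeros((m + 1, n + 1))
    d[:, 0] = np.arange(m + 1)
    d[0, :] = np.arange(n + 1)
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0.0 if a[i - 1] == b[j - 1] else sub_cost
            d[i, j] = min(d[i - 1, j] + 1,        # deletion
                          d[i, j - 1] + 1,        # insertion
                          d[i - 1, j - 1] + cost) # substitution
    return d[m, n]

# Hypothetical two-class string data (stand-ins for contour/shape codes).
train = ["aaab", "aaba", "abaa", "bbba", "bbab", "babb"]
labels = np.array([0, 0, 0, 1, 1, 1])
test = ["aabb", "bbaa"]

# Pairwise dissimilarities; this matrix need not be Euclidean or even metric.
D_train = np.array([[edit_distance(x, y, sub_cost=1.5) for y in train] for x in train])
D_test = np.array([[edit_distance(x, y, sub_cost=1.5) for y in train] for x in test])

# (1) Direct 1-NN rule on the raw dissimilarities.
nn_pred = labels[np.argmin(D_test, axis=1)]

# (2) Dissimilarity space: each row of D is a feature vector; any vector-space
# classifier can be trained on it (here, the nearest-mean classifier).
means = np.stack([D_train[labels == c].mean(axis=0) for c in (0, 1)])
ds_pred = np.argmin(np.linalg.norm(D_test[:, None, :] - means[None], axis=2), axis=1)

print("1-NN predictions:", nn_pred)
print("dissimilarity-space predictions:", ds_pred)
```

In the setting studied in the paper, one would sweep the cost parameter and compare the test errors of both routes; it is such a sweep that reveals an optimum at a non-Euclidean, somewhat non-metric configuration of the measure.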

[1] Josef Kittler et al., Pattern Recognition: A Statistical Approach, 1982.

[2] Horst Bunke et al., On Not Making Dissimilarities Euclidean, SSPR/SPR, 2004.

[3] Eric R. Ziegel et al., The Elements of Statistical Learning, Technometrics, 2003.

[4] Gabriela Andreu et al., Selecting the toroidal self-organizing feature maps (TSOFM) best organized to object recognition, Proceedings of the International Conference on Neural Networks (ICNN'97), 1997.

[5] Robert P. W. Duin et al., A Generalized Kernel Approach to Dissimilarity-based Classification, Journal of Machine Learning Research, 2002.

[6] Anil K. Jain et al., A modified Hausdorff distance for object matching, Proceedings of the 12th International Conference on Pattern Recognition, 1994.

[7] Horst Bunke et al., Syntactic and Structural Pattern Recognition: Theory and Applications, 1990.

[8] Robert P. W. Duin et al., The Dissimilarity Representation for Pattern Recognition: Foundations and Applications, Series in Machine Perception and Artificial Intelligence, 2005.

[9] Alexander J. Smola et al., Learning with Kernels, 1998.

[10] Horst Bunke et al., Applications of approximate string matching to 2D shape recognition, Pattern Recognition, 1993.

[11] R. C. Williamson et al., Classification on proximity data with LP-machines, 1999.

[12] Lev Goldfarb et al., A unified approach to pattern recognition, Pattern Recognition, 1984.

[13] Klaus-Robert Müller et al., Feature Discovery in Non-Metric Pairwise Data, Journal of Machine Learning Research, 2004.

[14] Cor J. Veenman et al., Tuning the hyperparameter of an AUC-optimized classifier, BNAIC, 2005.

[15] Remco C. Veltkamp et al., State of the Art in Shape Matching, Principles of Visual Information Retrieval, 2001.

[16] Robert P. W. Duin et al., Prototype selection for dissimilarity-based classifiers, Pattern Recognition, 2006.

[17] Daphna Weinshall et al., Classification with Nonmetric Distances: Image Retrieval and Class Representation, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2000.

[18] Bernard Haasdonk et al., Feature space interpretation of SVMs with indefinite kernels, IEEE Transactions on Pattern Analysis and Machine Intelligence, 2005.