Applications of lp-Norms and their Smooth Approximations for Gradient Based Learning Vector Quantization

Learning vector quantization using non-standard metrics has become quite popular for improving classification performance compared to standard approaches based on the Euclidean distance. Kernel metrics and quadratic forms are among the most promising approaches. In this paper we consider Minkowski distances (lp-norms). In particular, the l1-norm is known to be robust against noise in the data; if such structural knowledge about the data is available in advance, this norm should be utilized. However, their application in gradient-based learning algorithms relying on distance evaluations requires the calculation of the respective derivatives. Because lp-distance formulas contain the absolute value function, which is not differentiable at zero, smooth approximations thereof are required. In this paper we consider several approaches to smooth, consistent approximations for numerical evaluation and demonstrate their applicability in exemplary real-world applications.
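
As an illustration of the approach summarized above (a sketch under stated assumptions, not code from the paper itself), a commonly used smooth surrogate for the absolute value is |x| ≈ sqrt(x^2 + eps). The following Python snippet applies this surrogate to obtain a differentiable lp-distance and its gradient with respect to a prototype, as needed in a gradient-based LVQ update; all function names, parameter values, and the toy data are illustrative assumptions.

import numpy as np

def smooth_abs(x, eps=1e-6):
    # Smooth surrogate for |x|: sqrt(x^2 + eps) is differentiable everywhere
    # and approaches |x| as eps -> 0.
    return np.sqrt(x * x + eps)

def smooth_lp_distance(x, w, p=1.0, eps=1e-6):
    # Differentiable approximation of the Minkowski (lp) distance
    # d_p(x, w) = (sum_i |x_i - w_i|^p)^(1/p), with |.| replaced by smooth_abs.
    a = smooth_abs(x - w, eps)
    return np.sum(a ** p) ** (1.0 / p)

def lp_distance_gradient_wrt_prototype(x, w, p=1.0, eps=1e-6):
    # Gradient of the smoothed lp-distance with respect to the prototype w,
    # obtained by the chain rule; usable in an LVQ-style prototype update.
    diff = x - w
    a = smooth_abs(diff, eps)
    d = np.sum(a ** p) ** (1.0 / p)
    # dd/dw_i = d^(1-p) * a_i^(p-1) * (da_i/dw_i) with da_i/dw_i = -diff_i / a_i
    return -(d ** (1.0 - p)) * (a ** (p - 2.0)) * diff

# Toy usage (illustrative data): attract a prototype towards a sample by one
# gradient step on the smoothed l1-distance.
x = np.array([0.2, 1.5, -0.3])
w = np.array([0.0, 1.0, 0.0])
learning_rate = 0.05
w_updated = w - learning_rate * lp_distance_gradient_wrt_prototype(x, w, p=1.0)

For p = 1 the gradient reduces to a smooth sign function of the componentwise differences, which reflects the robustness of the l1-norm mentioned above.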
