Unsupervised Parameter Estimation of Non Linear Scaling for Improved Classification in the Dissimilarity Space

Non-linear scaling of given dissimilarities, by raising them to a power in the (0,1) interval, often improves classification performance in the corresponding dissimilarity space. The optimal value of the power can be found by a grid search over the leave-one-out cross-validation error of the classifier. This procedure can become costly for large dissimilarity matrices and, because it relies on labels, does not capture the global effect of the scaling. We propose an entirely unsupervised criterion whose optimization yields a suboptimal but often good enough value of the scaling power. The criterion is based on a trade-off between the dispersion of the data in the dissimilarity space and the corresponding intrinsic dimensionality, so that the concentrating effects of the power transformation on both the axes of the space and the spatial distribution of the objects are balanced.
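The supervised baseline described above (power scaling followed by a grid search on leave-one-out error) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the procedure used in the paper: the function names `power_scale`, `loo_1nn_error`, and `grid_search_power` are hypothetical, the classifier is assumed to be 1-nearest-neighbour with Euclidean distance between rows of the dissimilarity matrix, and all objects are kept as the representation set. The unsupervised criterion itself is not reproduced here, since the abstract does not give its formula.

```python
import numpy as np

def power_scale(D, rho):
    """Raise every dissimilarity to the power rho (elementwise)."""
    return np.power(np.asarray(D, dtype=float), rho)

def loo_1nn_error(D, labels):
    """Leave-one-out 1-NN error in the dissimilarity space.

    Assumption: each object is represented by its row of D, and
    neighbours are found with Euclidean distance between rows.
    Simplification: the left-out object stays in the representation set.
    """
    X = np.asarray(D, dtype=float)
    labels = np.asarray(labels)
    sq = np.sum(X ** 2, axis=1)
    # Squared Euclidean distances between all pairs of rows.
    dist2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    # An object may not be its own nearest neighbour.
    np.fill_diagonal(dist2, np.inf)
    nearest = np.argmin(dist2, axis=1)
    return float(np.mean(labels[nearest] != labels))

def grid_search_power(D, labels, powers):
    """Return the power with the lowest LOO error, plus all errors."""
    errors = [loo_1nn_error(power_scale(D, p), labels) for p in powers]
    best = int(np.argmin(errors))
    return powers[best], errors
```

Each candidate power requires a full pass over the n × n matrix and uses the labels, which is the cost and the supervision that the proposed criterion is meant to avoid.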
