Metric Learning via Penalized Optimization

Metric learning aims to project original data into a new space, where data points can be classified more accurately using kNN or similar types of classification algorithms. To avoid trivial learning results such as indistinguishably projecting the data onto a line, many existing approaches formulate metric learning as a constrained optimization problem, like finding a metric that minimizes the distance between data points from the same class, with a constraint of ensuring a certain separation for data points from different classes, and then they approximate the optimal solution to the constrained optimization in an iterative way. In order to improve the classification accuracy as much as possible, we try to find a metric that is able to minimize the intra-class distance and maximize the inter-class distance simultaneously. Towards this, we formulate metric learning as a penalized optimization problem, and provide design guideline, paradigms with a general formula, as well as two representative instantiations for the penalty term. In addition, we provide an analytical solution for the penalized optimization, with which costly computation can be avoid, and more importantly, there is no need to worry about the convergence rates or approximation ratios any more. Extensive experiments on real-world data sets are conducted, and the results verify the effectiveness and efficiency of our approach.

[1]  H. Sebastian Seung,et al.  Learning the parts of objects by non-negative matrix factorization , 1999, Nature.

[2]  Masashi Sugiyama,et al.  Dimensionality Reduction of Multimodal Labeled Data by Local Fisher Discriminant Analysis , 2007, J. Mach. Learn. Res..

[3]  Stan Sclaroff,et al.  Deep Metric Learning to Rank , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Tomer Hertz,et al.  Learning Distance Functions using Equivalence Relations , 2003, ICML.

[5]  V. Kshirsagar,et al.  Face recognition using Eigenfaces , 2011, 2011 3rd International Conference on Computer Research and Development.

[6]  Geoffrey E. Hinton,et al.  Neighbourhood Components Analysis , 2004, NIPS.

[7]  Jiwen Lu,et al.  Discriminative Deep Metric Learning for Face Verification in the Wild , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Marc Sebban,et al.  A Survey on Metric Learning for Feature Vectors and Structured Data , 2013, ArXiv.

[9]  Gabriela Csurka,et al.  Metric Learning for Large Scale Image Classification: Generalizing to New Classes at Near-Zero Cost , 2012, ECCV.

[10]  Rong Jin,et al.  An Information Geometry Approach for Distance Metric Learning , 2009, AISTATS.

[11]  Xin Gao,et al.  ProDis-ContSHC: learning protein dissimilarity measures and hierarchical context coherently for protein-protein comparison in protein database retrieval , 2012, BMC Bioinformatics.

[12]  Inderjit S. Dhillon,et al.  Information-theoretic metric learning , 2006, ICML '07.

[13]  Kourosh Zarringhalam,et al.  A Free Energy Based Approach for Distance Metric Learning , 2019, KDD.

[14]  Michael I. Jordan,et al.  Distance Metric Learning with Application to Clustering with Side-Information , 2002, NIPS.

[15]  Richard S. Zemel,et al.  Stochastic k-Neighborhood Selection for Supervised and Unsupervised Learning , 2013, ICML.

[16]  Wei Liu,et al.  Learning Distance Metrics with Contextual Constraints for Image Retrieval , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[17]  Vinod Nair,et al.  Learning hierarchical similarity metrics , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[18]  Gert R. G. Lanckriet,et al.  Robust Structural Metric Learning , 2013, ICML.

[19]  J. Kruskal Multidimensional scaling by optimizing goodness of fit to a nonmetric hypothesis , 1964 .

[20]  Jianping Gou,et al.  Locality-Based Discriminant Neighborhood Embedding , 2013, Comput. J..

[21]  Shuicheng Yan,et al.  Neighborhood preserving embedding , 2005, Tenth IEEE International Conference on Computer Vision (ICCV'05) Volume 1.

[22]  Xiaofei He,et al.  Locality Preserving Projections , 2003, NIPS.

[23]  Fernando Pérez-Cruz,et al.  Support vector classifier with hyperbolic tangent penalty function , 2000, 2000 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.00CH37100).

[24]  Pierre Comon,et al.  Independent component analysis, A new concept? , 1994, Signal Process..

[25]  Peng Li,et al.  Distance Metric Learning with Eigenvalue Optimization , 2012, J. Mach. Learn. Res..

[26]  Cordelia Schmid,et al.  TagProp: Discriminative metric learning in nearest neighbor models for image auto-annotation , 2009, 2009 IEEE 12th International Conference on Computer Vision.

[27]  Tat-Seng Chua,et al.  An efficient sparse metric learning in high-dimensional space via l1-penalized log-determinant regularization , 2009, ICML '09.

[28]  Anton van den Hengel,et al.  Non-sparse linear representations for visual tracking with online reservoir metric learning , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[29]  Kihyuk Sohn,et al.  Improved Deep Metric Learning with Multi-class N-pair Loss Objective , 2016, NIPS.

[30]  Kilian Q. Weinberger,et al.  Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.

[31]  Matthew R. Scott,et al.  Multi-Similarity Loss With General Pair Weighting for Deep Metric Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).