Distance Metric Learning with Joint Representation Diversification

Distance metric learning (DML) aims to learn a representation space equipped with a metric such that similar examples are closer than dissimilar examples under that metric. The recent success of deep neural networks (DNNs) has motivated many DML losses that encourage intra-class compactness and inter-class separability. The trade-off between intra-class compactness and inter-class separability shapes the DML representation space by determining how much information from the original inputs is retained. In this paper, we propose Distance Metric Learning with Joint Representation Diversification (JRD), which allows a better balance between intra-class compactness and inter-class separability. Specifically, we propose a Joint Representation Similarity regularizer that captures different abstraction levels of invariant features and diversifies the joint distributions of representations across multiple layers. Experiments on three deep DML benchmark datasets demonstrate the effectiveness of the proposed approach.
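
The abstract does not spell out how the Joint Representation Similarity regularizer is computed, so the following is only a minimal sketch of one possible reading, not the authors' formulation. It assumes that the similarity between two classes is measured by averaging, over all cross-class pairs of examples, a product of Gaussian kernels evaluated at each layer. The function names `gaussian_kernel` and `joint_similarity`, the kernel choice, the bandwidths, and the toy data are all hypothetical.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth):
    """Pairwise Gaussian kernel between rows of x (n, d) and rows of y (m, d)."""
    sq_dists = (
        np.sum(x ** 2, axis=1)[:, None]
        + np.sum(y ** 2, axis=1)[None, :]
        - 2.0 * x @ y.T
    )
    # Clip tiny negative values caused by floating-point rounding.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))


def joint_similarity(layers_a, layers_b, bandwidths):
    """Assumed joint similarity between two classes across several layers.

    layers_a[l] has shape (n, d_l) and layers_b[l] has shape (m, d_l): the
    layer-l representations of the examples from class A and class B.
    The per-layer kernels are multiplied so that a pair of examples only
    counts as similar if it is similar at every layer jointly.
    """
    n, m = layers_a[0].shape[0], layers_b[0].shape[0]
    product = np.ones((n, m))
    for feats_a, feats_b, bw in zip(layers_a, layers_b, bandwidths):
        product *= gaussian_kernel(feats_a, feats_b, bw)
    return float(product.mean())


# Toy data (hypothetical): two layers with 4 and 2 features per example.
rng = np.random.default_rng(0)
class_a = [rng.normal(0.0, 1.0, size=(8, 4)), rng.normal(0.0, 1.0, size=(8, 2))]
class_near = [rng.normal(0.5, 1.0, size=(8, 4)), rng.normal(0.5, 1.0, size=(8, 2))]
class_far = [rng.normal(5.0, 1.0, size=(8, 4)), rng.normal(5.0, 1.0, size=(8, 2))]

bandwidths = [2.0, 2.0]
sim_near = joint_similarity(class_a, class_near, bandwidths)
sim_far = joint_similarity(class_a, class_far, bandwidths)
print(f"similarity to nearby class:  {sim_near:.4f}")
print(f"similarity to distant class: {sim_far:.2e}")
```

Under this assumption, a regularizer would add the cross-class similarity, summed over pairs of distinct classes in a batch, to a standard DML loss, so that minimizing it pushes the joint multi-layer distributions of different classes apart.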
