Improved Embeddings with Easy Positive Triplet Mining

Deep metric learning seeks to define an embedding where semantically similar images are embedded to nearby locations, and semantically dissimilar images are embedded to distant locations. Substantial work has focused on loss functions and strategies to learn these embeddings by pushing images from the same class as close together in the embedding space as possible. In this paper, we propose an alternative, loosened embedding strategy that requires the embedding function only map each training image to the most similar examples from the same class, an approach we call "Easy Positive" mining. We provide a collection of experiments and visualizations that highlight that this Easy Positive mining leads to embed-dings that are more flexible and generalize better to new unseen data. This simple mining strategy yields recall performance that exceeds state of the art approaches (including those with complicated loss functions and ensemble methods) on image retrieval datasets including CUB, Stanford Online Products, In-Shop Clothes and Hotels-50K. Code is available at: https://github.com/littleredxh/EasyPositiveHardNegative

[1]  Weilin Huang,et al.  Deep Metric Learning with Hierarchical Triplet Loss , 2018, ECCV.

[2]  Xiaogang Wang,et al.  DeepFashion: Powering Robust Clothes Recognition and Retrieval with Rich Annotations , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Lucas Beyer,et al.  In Defense of the Triplet Loss for Person Re-Identification , 2017, ArXiv.

[4]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Chao Zhang,et al.  Hard-Aware Deeply Cascaded Embedding , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[6]  Yann LeCun,et al.  Dimensionality Reduction by Learning an Invariant Mapping , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[7]  Jungmin Lee,et al.  Attention-based Ensemble for Deep Metric Learning , 2018, ECCV.

[8]  Pietro Perona,et al.  Caltech-UCSD Birds 200 , 2010 .

[9]  Geoffrey E. Hinton,et al.  Neighbourhood Components Analysis , 2004, NIPS.

[10]  Manohar Paluri,et al.  Metric Learning with Adaptive Density Discrimination , 2015, ICLR.

[11]  Kaiqi Huang,et al.  Beyond Triplet Loss: A Deep Quadruplet Network for Person Re-identification , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12]  Kihyuk Sohn,et al.  Improved Deep Metric Learning with Multi-class N-pair Loss Objective , 2016, NIPS.

[13]  Silvio Savarese,et al.  Deep Metric Learning via Lifted Structured Feature Embedding , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[14]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Luca Antiga,et al.  Automatic differentiation in PyTorch , 2017 .

[17]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[18]  Robert Pless,et al.  Hotels-50K: A Global Hotel Recognition Dataset , 2019, AAAI.

[19]  Horst Possegger,et al.  BIER — Boosting Independent Embeddings Robustly , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[20]  Iasonas Kokkinos,et al.  Discriminative Learning of Deep Convolutional Feature Point Descriptors , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[21]  Robert Pless,et al.  Deep Randomized Ensembles for Metric Learning , 2018, ECCV.

[22]  Jonathan Krause,et al.  3D Object Representations for Fine-Grained Categorization , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[23]  Yair Movshovitz-Attias,et al.  No Fuss Distance Metric Learning Using Proxies , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[24]  Kilian Q. Weinberger,et al.  Distance Metric Learning for Large Margin Nearest Neighbor Classification , 2005, NIPS.

[25]  Liang Lin,et al.  Deep feature learning with relative distance comparison for person re-identification , 2015, Pattern Recognit..

[26]  Wei Zhang,et al.  Heated-Up Softmax Embedding , 2018, ArXiv.