Recall@k Surrogate Loss with Large Batches and Similarity Mixup

Direct optimization, by gradient descent, of an evaluation metric, is not possible when it is non-differentiable, which is the case for recall in retrieval. In this work, a differentiable surrogate loss for the recall is proposed. Using an implementation that sidesteps the hardware constraints of the GPU memory, the method trains with a very large batch size, which is essential for metrics computed on the entire retrieval database. It is assisted by an efficient mixup approach that operates on pairwise scalar similarities and virtually increases the batch size further. When used for deep metric learning, the proposed method achieves stateof-the-art results in several image retrieval benchmarks. For instance-level recognition, the method outperforms similar approaches that train using an approximation of average precision. The implementation will be made public.

[1]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Yuning Jiang,et al.  UnitBox: An Advanced Object Detection Network , 2016, ACM Multimedia.

[3]  Ismail Elezi,et al.  Learning Intra-Batch Connections for Deep Metric Learning , 2021, ICML.

[4]  Jiri Matas,et al.  FEDS - Filtered Edit Distance Surrogate , 2021, ICDAR.

[5]  Ioannis Mitliagkas,et al.  Manifold Mixup: Better Representations by Interpolating Hidden States , 2018, ICML.

[6]  Geoffrey E. Hinton,et al.  Semantic hashing , 2009, Int. J. Approx. Reason..

[7]  Nikolay Kyurkchiev,et al.  On the approximation of the step function by some sigmoid functions , 2017, Math. Comput. Simul..

[8]  Svetoslav Markov,et al.  On the Approximation of the Cut and Step Functions by Logistic and Gompertz Functions , 2015 .

[9]  Luc Van Gool,et al.  Conditional Probability Models for Deep Image Compression , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[10]  Pablo Piantanida,et al.  Metric learning: cross-entropy vs. pairwise losses , 2020, ArXiv.

[11]  Jiri Matas,et al.  Learning Surrogates via Deep Embedding , 2020, ECCV.

[12]  Giorgos Tolias,et al.  Learning and aggregating deep local descriptors for instance-level recognition , 2020, ECCV.

[13]  Miroslaw Bober,et al.  Siamese Network of Deep Fisher-Vector Descriptors for Image Retrieval , 2017, ArXiv.

[14]  Bhiksha Raj,et al.  SphereFace: Deep Hypersphere Embedding for Face Recognition , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Stan Sclaroff,et al.  Deep Metric Learning to Rank , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[16]  Andrew Zisserman,et al.  Smooth-AP: Smoothing the Path Towards Large-Scale Image Retrieval , 2020, ECCV.

[17]  Sergey Levine,et al.  MuProp: Unbiased Backpropagation for Stochastic Neural Networks , 2015, ICLR.

[18]  Joseph Paul Cohen,et al.  Revisiting Training Strategies and Generalization Performance in Deep Metric Learning , 2020, ICML.

[19]  R. Manmatha,et al.  Saliency Driven Perceptual Image Compression , 2020, 2021 IEEE Winter Conference on Applications of Computer Vision (WACV).

[20]  Albert Gordo,et al.  End-to-End Learning of Deep Visual Representations for Image Retrieval , 2016, International Journal of Computer Vision.

[21]  Giorgos Tolias,et al.  Fine-Tuning CNN Image Retrieval with No Human Annotation , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[22]  Hao-Yu Wu,et al.  Classification is a Strong Baseline for Deep Metric Learning , 2018, BMVC.

[23]  Geonmo Gu,et al.  Proxy Synthesis: Learning with Synthetic Classes for Deep Metric Learning , 2021, AAAI.

[24]  David Stutz,et al.  Neural Codes for Image Retrieval , 2015 .

[25]  Qi Qian,et al.  SoftTriple Loss: Deep Metric Learning Without Triplet Sampling , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[26]  Weilin Huang,et al.  Cross-Batch Memory for Embedding Learning , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[27]  Jon Almazán,et al.  Learning With Average Precision: Training Image Retrieval With a Listwise Loss , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[28]  Graham W. Taylor,et al.  ProxyNCA++: Revisiting and Revitalizing Proxy Neighborhood Component Analysis , 2020, ECCV.

[29]  Victor S. Lempitsky,et al.  Learning Deep Embeddings with Histogram Loss , 2016, NIPS.

[30]  Yann LeCun,et al.  Dimensionality Reduction by Learning an Invariant Mapping , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[31]  Jonathan Krause,et al.  3D Object Representations for Fine-Grained Categorization , 2013, 2013 IEEE International Conference on Computer Vision Workshops.

[32]  Gattigorla Nagendar,et al.  Neuro-IoU: Learning a Surrogate Loss for Semantic Segmentation , 2018, BMVC.

[33]  Yannis Kalantidis,et al.  Hard Negative Mixing for Contrastive Learning , 2020, NeurIPS.

[34]  Alexander J. Smola,et al.  Sampling Matters in Deep Embedding Learning , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[35]  Matthew R. Scott,et al.  Multi-Similarity Loss With General Pair Weighting for Deep Metric Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[36]  Yee Whye Teh,et al.  The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables , 2016, ICLR.

[37]  Silvio Savarese,et al.  Generalized Intersection Over Union: A Metric and a Loss for Bounding Box Regression , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[38]  Yannis Avrithis,et al.  Revisiting Oxford and Paris: Large-Scale Image Retrieval Benchmarking , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[39]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[40]  Yannis Avrithis,et al.  It Takes Two to Tango: Mixup for Deep Metric Learning , 2021, ArXiv.

[41]  Grant Van Horn,et al.  The iNaturalist Species Classification and Detection Dataset-Supplementary Material , 2018 .

[42]  Kihyuk Sohn,et al.  Improved Deep Metric Learning with Multi-class N-pair Loss Objective , 2016, NIPS.

[43]  Kyurkchiev Nikolay,et al.  Sigmoid Functions: Some Approximation and Modelling Aspects , 2015 .

[44]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[45]  Yan Lu,et al.  Local Descriptors Optimized for Average Precision , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[46]  David Minnen,et al.  Variational image compression with a scale hyperprior , 2018, ICLR.

[47]  Geonmo Gu,et al.  Symmetrical Synthesis for Deep Metric Learning , 2020, AAAI.

[48]  Jian Wang,et al.  Deep Metric Learning with Angular Loss , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[49]  Karsten Roth,et al.  MIC: Mining Interclass Characteristics for Improved Metric Learning , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[50]  Xing Ji,et al.  CosFace: Large Margin Cosine Loss for Deep Face Recognition , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[51]  Marcello Pelillo,et al.  The Group Loss for Deep Metric Learning , 2020, ECCV.

[52]  Tiejun Huang,et al.  Deep Relative Distance Learning: Tell the Difference between Similar Vehicles , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[53]  Claudio Michaelis,et al.  Optimizing Rank-Based Metrics With Blackbox Differentiation , 2020, 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[54]  Yang Song,et al.  The iNaturalist Species Classification and Detection Dataset , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[55]  Joelle Pineau,et al.  An Actor-Critic Algorithm for Sequence Prediction , 2016, ICLR.

[56]  Yair Movshovitz-Attias,et al.  No Fuss Distance Metric Learning Using Proxies , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[57]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[58]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[59]  Aymeric Histace,et al.  Metric Learning With HORDE: High-Order Regularizer for Deep Embeddings , 2019, 2019 IEEE/CVF International Conference on Computer Vision (ICCV).

[60]  Bohyung Han,et al.  Large-Scale Image Retrieval with Attentive Deep Local Features , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[61]  Hongyi Zhang,et al.  mixup: Beyond Empirical Risk Minimization , 2017, ICLR.

[62]  Stefanos Zafeiriou,et al.  ArcFace: Additive Angular Margin Loss for Deep Face Recognition , 2018, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[63]  Ser-Nam Lim,et al.  A Metric Learning Reality Check , 2020, ECCV.

[64]  Silvio Savarese,et al.  Deep Metric Learning via Lifted Structured Feature Embedding , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).