Deep Metric Learning with False Positive Probability - Trade Off Hard Levels in a Weighted Way

In recent years, deep metric learning has been an end-to-end fashion in computer vision community due to the great success of deep learning. However, existing deep metric learning frameworks are faced with a dilemma about the hard level trade-off for training examples. Namely, the “harder” examples we feed to neural networks, the more likely we attain highly discriminative models, but the more easily neural networks get stuck into poor local minimal in practice. To fight against this dilemma, we propose a deep metric learning method with FAlse Positive ProbabilitY (FAPPY) to gradually incorporate different hard levels. Unlike mainstream deep metric learning schemes, the presented approach optimizes similarity probability distribution among training samples, instead of the similarity itself. Experimental results on CUB-200-2011, Stanford Online Products and VehicleID datasets show that our FAPPY method achieves or outperforms state-of-the-art metric learning methods on fine-grained image retrieval and vehicle re-identification tasks. Besides, the presented method has relatively low sensitivity of hyper-parameters and it requires minor changes on traditional classification networks.

[1]  Stefanie Jegelka,et al.  Deep Metric Learning via Facility Location , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[2]  Hai Tao,et al.  Evaluating Appearance Models for Recognition, Reacquisition, and Tracking , 2007 .

[3]  Frédéric Jurie,et al.  Improving Semantic Embedding Consistency by Metric Learning for Zero-Shot Classiffication , 2016, ECCV.

[4]  Feng Zhou,et al.  Fine-Grained Categorization and Dataset Bootstrapping Using Deep Metric Learning with Humans in the Loop , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[5]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[6]  Victor S. Lempitsky,et al.  Learning Deep Embeddings with Histogram Loss , 2016, NIPS.

[7]  Pietro Perona,et al.  Caltech-UCSD Birds 200 , 2010 .

[8]  Yann LeCun,et al.  Learning a similarity metric discriminatively, with application to face verification , 2005, 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'05).

[9]  Silvio Savarese,et al.  Deep Metric Learning via Lifted Structured Feature Embedding , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10]  Chao Zhang,et al.  Hard-Aware Deeply Cascaded Embedding , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[11]  Michael S. Bernstein,et al.  ImageNet Large Scale Visual Recognition Challenge , 2014, International Journal of Computer Vision.

[12]  Yu Qiao,et al.  A Discriminative Feature Learning Approach for Deep Face Recognition , 2016, ECCV.

[13]  Cordelia Schmid,et al.  Product Quantization for Nearest Neighbor Search , 2011, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[14]  Tiejun Huang,et al.  Deep Relative Distance Learning: Tell the Difference between Similar Vehicles , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[15]  Andrew Zisserman,et al.  Return of the Devil in the Details: Delving Deep into Convolutional Nets , 2014, BMVC.

[16]  Xiang Li,et al.  Top-Push Video-Based Person Re-identification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Dumitru Erhan,et al.  Going deeper with convolutions , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[18]  James Philbin,et al.  FaceNet: A unified embedding for face recognition and clustering , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Feng Zhou,et al.  Embedding Label Structures for Fine-Grained Feature Representation , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[20]  Venkatesh Saligrama,et al.  Zero-Shot Learning via Joint Latent Similarity Embedding , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).