Zero-shot Metric Learning

In this work, we tackle the zero-shot metric learning problem and propose a novel method, ZSML, which learns a distance metric that measures the similarity of unseen categories (and even unseen datasets). ZSML achieves strong transferability by capturing multiple nonlinear yet continuous relations among data. It is motivated by two facts: 1) relations can be described from various perspectives; and 2) traditional binary supervision is insufficient to represent continuous visual similarity. Specifically, we first reformulate a collection of specific-shaped convolutional kernels to combine data pairs and generate multiple relation vectors. We then design a new cross-update regression loss to discover continuous similarity. Extensive experiments, including intra-dataset and inter-dataset transfer on four benchmark datasets, demonstrate that ZSML achieves state-of-the-art performance.
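
The idea of combining a data pair through several kernels to obtain multiple relation vectors can be illustrated with a minimal sketch. This is not the authors' implementation: the 2x1 kernel shape, the ReLU nonlinearity, the hand-picked weights, and the function name `relation_vectors` are all assumptions made for illustration. In the actual method the kernels would be learned, and the cross-update regression loss is not shown here because the abstract does not specify it.

```python
import numpy as np

def relation_vectors(x, y, kernels):
    """Combine a pair of d-dim features into K relation vectors.

    Assumption (not from the paper): each kernel is a 2x1 filter
    (w_x, w_y) slid along the feature dimension of the stacked
    2 x d pair, followed by a ReLU.
    """
    pair = np.stack([x, y])                 # shape (2, d)
    # Row k of the result is relu(w_kx * x + w_ky * y).
    return np.maximum(kernels @ pair, 0.0)  # shape (K, d)

rng = np.random.default_rng(0)
x = rng.normal(size=8)   # hypothetical feature of the first sample
y = rng.normal(size=8)   # hypothetical feature of the second sample

# Hand-picked kernels standing in for learned ones; each row gives a
# different "perspective" on the pair.
kernels = np.array([
    [1.0, -1.0],   # positive part of x - y
    [-1.0, 1.0],   # positive part of y - x
    [0.5, 0.5],    # positive part of the pair average
])

R = relation_vectors(x, y, kernels)
print(R.shape)  # one relation vector per kernel
```

Each row of `R` is one relation vector for the pair; a downstream regressor could map these vectors to a continuous similarity score instead of a binary same/different label.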
