Pseudo distribution on unseen classes for generalized zero shot learning

Abstract: Zero-Shot Learning (ZSL) has attracted growing attention because it can recognize new objects without retraining, but it has a serious drawback: at prediction time it considers only unseen classes. Generalized ZSL (GZSL) addresses this by extending the search range to both seen and unseen classes, which makes the task more realistic and more challenging. Conventional GZSL methods often suffer from the domain shift problem because only seen-class data are available for training. Deep Calibration Network (DCN) tries to balance training across seen and unseen classes by minimizing the entropy of assigning seen data to unseen classes. However, DCN still has two problems: the hubness problem and a lack of training guidance. To solve both, we propose a novel method called PSeudo Distribution (PSD), which uses the attribute similarity between seen and unseen classes as training guidance for assigning seen data to unseen classes. The attribute similarity is additionally compressed to a one-hot vector to further encourage the certainty of the model. Moreover, the visual space is used as the embedding space, which alleviates the hubness problem. Extensive experiments on four popular datasets show the superiority of the proposed method.
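The pseudo-distribution idea can be illustrated with a minimal sketch. The abstract does not specify the similarity measure, the normalization, or the loss, so the cosine similarity, the softmax with a `temperature` parameter, and the cross-entropy form below are all assumptions, and the function names are hypothetical. The sketch covers only the training targets, not the visual-space embedding.

```python
import numpy as np

def pseudo_distribution(seen_attrs, unseen_attrs, temperature=1.0):
    """Soft distribution of each seen class over the unseen classes.

    Assumption: cosine similarity between class attribute vectors,
    normalized with a softmax. The abstract does not fix either choice.
    """
    s = seen_attrs / np.linalg.norm(seen_attrs, axis=1, keepdims=True)
    u = unseen_attrs / np.linalg.norm(unseen_attrs, axis=1, keepdims=True)
    logits = (s @ u.T) / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    p = np.exp(logits)
    return p / p.sum(axis=1, keepdims=True)

def to_one_hot(dist):
    """Compress each row to a one-hot vector on its most similar unseen class."""
    one_hot = np.zeros_like(dist)
    one_hot[np.arange(dist.shape[0]), dist.argmax(axis=1)] = 1.0
    return one_hot

def pseudo_loss(pred_unseen_probs, seen_labels, targets, eps=1e-12):
    """Cross-entropy (assumed loss) between the model's unseen-class
    probabilities for seen samples and the pseudo target of each sample's class."""
    t = targets[seen_labels]
    return float(-(t * np.log(pred_unseen_probs + eps)).sum(axis=1).mean())

# Toy attributes (made up for illustration): 2 seen and 2 unseen classes.
seen_attrs = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]])
unseen_attrs = np.array([[1.0, 0.1, 0.0], [0.0, 1.0, 0.9]])

dist = pseudo_distribution(seen_attrs, unseen_attrs)
one_hot = to_one_hot(dist)
```

Here each row of `dist` is a soft target over unseen classes for one seen class, and `one_hot` is its compressed form. Either can be passed as `targets` to `pseudo_loss` together with a model's predicted unseen-class probabilities for a batch of seen samples.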
