Cross-Class Sample Synthesis for Zero-shot Learning

Zero-shot learning (ZSL) aims to recognize unseen classes, which have no available training samples, by establishing an association with seen classes. Existing approaches mostly learn a compatibility function to predict the class of an image. In contrast to previous approaches, we put forward a novel method, Cross-Class Sample Synthesis (CCSS), which directly synthesizes samples of unseen classes from specific seen classes in the visual feature space. We adopt a class graph to measure inter-class similarity and propose class entropy to select the classes that serve as the synthesis source for each target class. An end-to-end network is constructed to realize sample synthesis from source classes to target classes. Specifically, a rule of attribute-guided cross-class transfer is built into the network, according to which samples of different source classes can be used to synthesize samples of each target class. The synthesized samples are used as training data for the unseen classes, which turns ZSL into a supervised learning problem. Experiments on five benchmark datasets demonstrate the advantage of our proposed method.
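The source-selection step can be illustrated with a small sketch. This is not the CCSS network or its class-entropy criterion, neither of which is specified in the abstract. It only assumes that each class has an attribute vector, measures inter-class similarity by cosine similarity, and picks the k most similar seen classes as hypothetical synthesis sources for each unseen class. The function name `select_sources`, the parameter `k`, and the toy attribute values are all illustrative assumptions.

```python
import numpy as np

def select_sources(attrs_seen, attrs_unseen, k=2):
    """Return, for each unseen class, the indices of its k most similar seen classes.

    Similarity is plain cosine similarity between class attribute vectors.
    This is an assumed stand-in for the class graph and class entropy
    described in the abstract, not the authors' actual criterion.
    """
    seen_unit = attrs_seen / np.linalg.norm(attrs_seen, axis=1, keepdims=True)
    unseen_unit = attrs_unseen / np.linalg.norm(attrs_unseen, axis=1, keepdims=True)
    sim = unseen_unit @ seen_unit.T           # shape: (n_unseen, n_seen)
    return np.argsort(-sim, axis=1)[:, :k]    # most similar seen classes first

# Toy attribute vectors (made up for illustration): 3 seen classes, 1 unseen class.
attrs_seen = np.array([[1.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0],
                       [0.0, 0.0, 1.0]])
attrs_unseen = np.array([[0.9, 0.1, 0.0]])

sources = select_sources(attrs_seen, attrs_unseen, k=2)
print(sources)
```

In this toy setup, the unseen class shares most of its attribute mass with the first seen class and a little with the second, so those two would be chosen as sources. Features synthesized from such sources could then be fed to any standard supervised classifier.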
