论文信息 - Feature Generating Networks for Zero-Shot Learning

Feature Generating Networks for Zero-Shot Learning

Suffering from the extreme training data imbalance between seen and unseen classes, most of existing state-of-the-art approaches fail to achieve satisfactory results for the challenging generalized zero-shot learning task. To circumvent the need for labeled examples of unseen classes, we propose a novel generative adversarial network (GAN) that synthesizes CNN features conditioned on class-level semantic information, offering a shortcut directly from a semantic descriptor of a class to a class-conditional feature distribution. Our proposed approach, pairing a Wasserstein GAN with a classification loss, is able to generate sufficiently discriminative CNN features to train softmax classifiers or any multimodal embedding method. Our experimental results demonstrate a significant boost in accuracy over the state of the art on five challenging datasets - CUB, FLO, SUN, AWA and ImageNet - in both the zero-shot learning and generalized zero-shot learning settings.

[1] Frédéric Jurie,et al. Improving Semantic Embedding Consistency by Metric Learning for Zero-Shot Classiffication , 2016, ECCV.

[2] Aaron C. Courville,et al. Improved Training of Wasserstein GANs , 2017, NIPS.

[3] Wei-Lun Chao,et al. Synthesized Classifiers for Zero-Shot Learning , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4] Richard S. Zemel,et al. Generative Moment Matching Networks , 2015, ICML.

[5] Bernt Schiele,et al. Zero-Shot Learning — The Good, the Bad and the Ugly , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6] Andrew Zisserman,et al. Automated Flower Classification over a Large Number of Classes , 2008, 2008 Sixth Indian Conference on Computer Vision, Graphics & Image Processing.

[7] Shaogang Gong,et al. Zero-shot object recognition by semantic manifold distance , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[8] Philip H. S. Torr,et al. An embarrassingly simple approach to zero-shot learning , 2015, ICML.

[9] Bernt Schiele,et al. Latent Embeddings for Zero-Shot Classification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[10] Xiaodong Yu,et al. Attribute-Based Transfer Learning for Object Categorization with Zero/One Training Example , 2010, ECCV.

[11] Bernt Schiele,et al. Learning Deep Representations of Fine-Grained Visual Descriptions , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[12] Anton van den Hengel,et al. Less is More: Zero-Shot Learning from Online Textual Documents with Noise Suppression , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13] Li Fei-Fei,et al. ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[14] Bernt Schiele,et al. Evaluating knowledge transfer and zero-shot learning in a large-scale setting , 2011, CVPR 2011.

[15] Nitesh V. Chawla,et al. SMOTE: Synthetic Minority Over-sampling Technique , 2002, J. Artif. Intell. Res..

[16] Christoph H. Lampert,et al. Attribute-Based Classification for Zero-Shot Visual Object Categorization , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[17] Dimitris N. Metaxas,et al. StackGAN: Text to Photo-Realistic Image Synthesis with Stacked Generative Adversarial Networks , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[18] Bernt Schiele,et al. Multi-cue Zero-Shot Learning with Strong Supervision , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19] Léon Bottou,et al. Towards Principled Methods for Training Generative Adversarial Networks , 2017, ICLR.

[20] Abhinav Gupta,et al. Generative Image Modeling Using Style and Structure Adversarial Networks , 2016, ECCV.

[21] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[22] Pietro Perona,et al. Caltech-UCSD Birds 200 , 2010 .

[23] Samy Bengio,et al. Zero-Shot Learning by Convex Combination of Semantic Embeddings , 2013, ICLR.

[24] Bernt Schiele,et al. Learning What and Where to Draw , 2016, NIPS.

[25] Tao Xiang,et al. Learning a Deep Embedding Model for Zero-Shot Learning , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26] Dale Schuurmans,et al. Semi-Supervised Zero-Shot Classification with Label Representation Learning , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[27] Vladlen Koltun,et al. Photographic Image Synthesis with Cascaded Refinement Networks , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[28] Tsuyoshi Murata,et al. {m , 1934, ACML.

[29] Bharath Hariharan,et al. Low-Shot Visual Recognition by Shrinking and Hallucinating Features , 2016, 2017 IEEE International Conference on Computer Vision (ICCV).

[30] Frédéric Jurie,et al. Generating Visual Representations for Zero-Shot Classification , 2017, 2017 IEEE International Conference on Computer Vision Workshops (ICCVW).

[31] Bernt Schiele,et al. Generative Adversarial Text to Image Synthesis , 2016, ICML.

[32] Yanwei Fu,et al. Semi-supervised Vocabulary-Informed Learning , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[33] Shaogang Gong,et al. Unsupervised Domain Adaptation for Zero-Shot Learning , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[34] Wojciech Zaremba,et al. Improved Techniques for Training GANs , 2016, NIPS.

[35] Léon Bottou,et al. Wasserstein GAN , 2017, ArXiv.

[36] XiangTao,et al. Transductive Multi-View Zero-Shot Learning , 2015 .

[37] Bernt Schiele,et al. Evaluation of output embeddings for fine-grained image classification , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[38] Bernt Schiele,et al. Transfer Learning in a Transductive Setting , 2013, NIPS.

[39] Yoshua Bengio,et al. Zero-data Learning of New Tasks , 2008, AAAI.

[40] Pieter Abbeel,et al. InfoGAN: Interpretable Representation Learning by Information Maximizing Generative Adversarial Nets , 2016, NIPS.

[41] Simon Osindero,et al. Conditional Generative Adversarial Nets , 2014, ArXiv.

[42] Andrew Y. Ng,et al. Zero-Shot Learning Through Cross-Modal Transfer , 2013, NIPS.

[43] James Hays,et al. SUN attribute database: Discovering, annotating, and recognizing scene attributes , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[44] Venkatesh Saligrama,et al. Zero-Shot Learning via Semantic Similarity Embedding , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[45] Marc'Aurelio Ranzato,et al. DeViSE: A Deep Visual-Semantic Embedding Model , 2013, NIPS.

[46] Ahmed M. Elgammal,et al. Learning Hypergraph-regularized Attribute Predictors , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[47] Soumith Chintala,et al. Unsupervised Representation Learning with Deep Convolutional Generative Adversarial Networks , 2015, ICLR.

[48] Jian Sun,et al. Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[49] Wei-Lun Chao,et al. An Empirical Study and Analysis of Generalized Zero-Shot Learning for Object Recognition in the Wild , 2016, ECCV.

[50] Yoshua Bengio,et al. Generative Adversarial Nets , 2014, NIPS.