Unified Generator-Classifier for Efficient Zero-Shot Learning

Generative models have achieved state-of-the-art performance for the zero-shot learning problem, but they require re-training the classifier every time a new object category is encountered. The traditional semantic embedding approaches, though very elegant, usually do not perform at par with their generative counterparts. In this work, we propose an unified framework termed GenClass, which integrates the generator with the classifier for efficient zero-shot learning, thus combining the representative power of the generative approaches and the elegance of the embedding approaches. End-to-end training of the unified framework not only eliminates the requirement of additional classifier for new object categories as in the generative approaches, but also facilitates the generation of more discriminative and useful features. Extensive evaluation on three standard zero-shot object classification datasets, namely AWA, CUB and SUN shows the effectiveness of the proposed approach. The approach without any modification, also gives state-of-the-art performance for zero-shot action classification, thus showing its generalizability to other domains.

[1]  Shaogang Gong,et al.  Unsupervised Domain Adaptation for Zero-Shot Learning , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[2]  Shaogang Gong,et al.  Semantic Autoencoder for Zero-Shot Learning , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[3]  Hema A. Murthy,et al.  A Generative Model for Zero Shot Learning Using Conditional Variational Autoencoders , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW).

[4]  Thomas Serre,et al.  HMDB: A large video database for human motion recognition , 2011, 2011 International Conference on Computer Vision.

[5]  Bernt Schiele,et al.  Evaluation of output embeddings for fine-grained image classification , 2014, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[6]  Bernt Schiele,et al.  Latent Embeddings for Zero-Shot Classification , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Max Welling,et al.  Auto-Encoding Variational Bayes , 2013, ICLR.

[8]  Piyush Rai,et al.  A Generative Approach to Zero-Shot and Few-Shot Action Recognition , 2018, 2018 IEEE Winter Conference on Applications of Computer Vision (WACV).

[9]  Philip H. S. Torr,et al.  An embarrassingly simple approach to zero-shot learning , 2015, ICML.

[10]  Piyush Rai,et al.  Generalized Zero-Shot Learning via Synthesized Examples , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[11]  Aaron C. Courville,et al.  Improved Training of Wasserstein GANs , 2017, NIPS.

[12]  Silvio Savarese,et al.  Recognizing human actions by attributes , 2011, CVPR 2011.

[13]  Mubarak Shah,et al.  UCF101: A Dataset of 101 Human Actions Classes From Videos in The Wild , 2012, ArXiv.

[14]  Samy Bengio,et al.  Zero-Shot Learning by Convex Combination of Semantic Embeddings , 2013, ICLR.

[15]  Cordelia Schmid,et al.  Label-Embedding for Image Classification , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[16]  Michael I. Jordan,et al.  Generalized Zero-Shot Learning with Deep Calibration Network , 2018, NeurIPS.

[17]  Christoph H. Lampert,et al.  Attribute-Based Classification for Zero-Shot Visual Object Categorization , 2014, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  Yang Yang,et al.  Matrix Tri-Factorization with Manifold Regularizations for Zero-Shot Learning , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[19]  Ke Chen,et al.  Zero-Shot Visual Recognition via Bidirectional Latent Embedding , 2016, International Journal of Computer Vision.

[20]  Bernt Schiele,et al.  Feature Generating Networks for Zero-Shot Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[21]  Jian Sun,et al.  Deep Residual Learning for Image Recognition , 2015, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Chih-Min Lin,et al.  Generative Adversarial Nets in Robotic Chinese Calligraphy , 2018, 2018 IEEE International Conference on Robotics and Automation (ICRA).

[23]  Pietro Perona,et al.  The Caltech-UCSD Birds-200-2011 Dataset , 2011 .

[24]  Christoph H. Lampert,et al.  Zero-Shot Learning—A Comprehensive Evaluation of the Good, the Bad and the Ugly , 2017, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[25]  Shaogang Gong,et al.  Semantic embedding space for zero-shot action recognition , 2015, 2015 IEEE International Conference on Image Processing (ICIP).

[26]  Andrew Y. Ng,et al.  Zero-Shot Learning Through Cross-Modal Transfer , 2013, NIPS.

[27]  James Hays,et al.  SUN attribute database: Discovering, annotating, and recognizing scene attributes , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[28]  Venkatesh Saligrama,et al.  Zero-Shot Learning via Semantic Similarity Embedding , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[29]  Marc'Aurelio Ranzato,et al.  DeViSE: A Deep Visual-Semantic Embedding Model , 2013, NIPS.

[30]  Tao Xiang,et al.  Learning to Compare: Relation Network for Few-Shot Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.