Label-activating framework for zero-shot learning

Existing zero-shot learning (ZSL) models usually learn mappings between the visual space and the semantic space, but few of them take label information into account. Indirect Attribute Prediction (IAP) learns the posterior probability of each attribute through the label space, but it defines the labels of seen and unseen classes in different spaces, which makes it unsuitable for Generalized ZSL (GZSL). We propose a Label-Activating Framework (LAF) for semantic-based classification. The framework activates the label space by learning mappings from vision and semantics to labels. In the training phase, the original label space, made up of one-hot vectors, serves as a common space into which visual features and semantic information are embedded. Once the label space is activated, the labels of unseen classes can be regarded as linear combinations of the labels of seen classes. Seen and unseen labels are then defined in the same space, and the label space carries specific meaning rather than serving only as a set of class identifiers. This makes the activated label space highly discriminative, especially for GZSL, a setting that is more challenging and more realistic for real-world tasks. In addition, we develop a specific model based on the framework that effectively mitigates the projection domain shift problem. Extensive experiments show that our framework outperforms state-of-the-art methods and is well suited to GZSL.
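
The idea of embedding vision and semantics into a shared one-hot label space can be illustrated with a small sketch. This is not the model developed in the paper: it assumes plain ridge regression for both mappings, synthetic toy data, and cosine similarity for the final label assignment, and every name in it (`ridge_map`, `W_vis`, `W_sem`, `predict`) is hypothetical. It only shows how unseen-class labels can end up as vectors over the seen-class axes, so that seen and unseen labels share one space.

```python
import numpy as np

def ridge_map(X, Y, lam=1e-2):
    """Closed-form ridge regression: W minimizing ||X W - Y||^2 + lam * ||W||^2."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)

rng = np.random.default_rng(0)
n_seen, n_unseen, n_attr, n_feat, per_class = 4, 2, 6, 16, 30

# Assumption: toy class-level attribute vectors stand in for real semantics.
A_seen = rng.random((n_seen, n_attr))
A_unseen = rng.random((n_unseen, n_attr))

# Assumption: toy visual features are a noisy linear function of the attributes.
P = rng.normal(size=(n_attr, n_feat))
y_seen = np.repeat(np.arange(n_seen), per_class)
X_seen = A_seen[y_seen] @ P + 0.05 * rng.normal(size=(y_seen.size, n_feat))

# The original label space: one-hot vectors over the seen classes.
Y_onehot = np.eye(n_seen)[y_seen]

# Learn mappings from vision and from semantics into the label space.
W_vis = ridge_map(X_seen, Y_onehot)        # visual features -> labels
W_sem = ridge_map(A_seen, np.eye(n_seen))  # class attributes -> labels

# "Activated" labels: each unseen class becomes a vector of weights over
# the seen-class axes, i.e. a linear combination of seen one-hot labels.
L_seen = A_seen @ W_sem
L_unseen = A_unseen @ W_sem
L_all = np.vstack([L_seen, L_unseen])

def predict(X):
    """GZSL-style prediction: nearest label (cosine) among seen and unseen classes."""
    Z = X @ W_vis
    Zn = Z / (np.linalg.norm(Z, axis=1, keepdims=True) + 1e-12)
    Ln = L_all / (np.linalg.norm(L_all, axis=1, keepdims=True) + 1e-12)
    return np.argmax(Zn @ Ln.T, axis=1)

pred = predict(X_seen)  # indices 0..3 are seen classes, 4..5 are unseen classes
```

Because `L_unseen` lives in the same `n_seen`-dimensional space as the seen labels, a single nearest-label search covers both groups of classes, which is the property the abstract highlights for GZSL.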
