Knowledge-Based Fine-Grained Classification For Few-Shot Learning

Fine-grained few-shot image classification is difficult because the small inter-class variance and the large intra-class variance leave the machine with too little information to learn from only a few images. External knowledge carries richer semantics and can help the model extract important features, yet most existing few-shot learning algorithms leverage only the visual features of images; little attention has been paid to cross-modal external knowledge. In this paper, we propose a knowledge-based fine-grained classification mechanism for few-shot learning, which overcomes the difficulty of extracting only limited discriminative features from unimodal samples. We extract visual features, together with knowledge features from textual descriptions and a domain-specific knowledge graph, at both global and local levels to build a semantic space. To bridge the gap between multimodal features, we propose a mirror framework, named the Mirror Mapping Network (MMN), which maps the multimodal features into the same semantic space in two directions. Extensive experimental results show that our method outperforms the state of the art.
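The core idea of mapping multimodal features into a shared semantic space in two directions can be sketched as follows. This is a minimal illustrative sketch, not the paper's actual MMN architecture: the dimensions, the linear projections `W_v`/`W_t`, and the specific loss terms are all assumptions for illustration, standing in for learned network components.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature dimensions: visual, textual/knowledge, shared semantic.
d_vis, d_txt, d_sem = 512, 300, 128

# Hypothetical projection matrices (learned by the network in practice).
W_v = rng.normal(scale=0.01, size=(d_vis, d_sem))  # visual -> semantic
W_t = rng.normal(scale=0.01, size=(d_txt, d_sem))  # knowledge -> semantic

def to_semantic(x_vis, x_txt):
    """Map both modalities into the same semantic space (forward direction)."""
    return x_vis @ W_v, x_txt @ W_t

def mirror_loss(z_v, z_t, x_vis, x_txt):
    """Alignment term plus two 'mirror' reconstruction terms, so the mapping
    is constrained in both directions: modality -> semantic and semantic ->
    modality (here sketched with the transposed projections)."""
    align = np.mean((z_v - z_t) ** 2)          # pull paired features together
    rec_v = np.mean((z_v @ W_v.T - x_vis) ** 2)  # semantic -> visual
    rec_t = np.mean((z_t @ W_t.T - x_txt) ** 2)  # semantic -> knowledge
    return align + rec_v + rec_t
```

In this sketch the mirror constraint means the shared embedding must both align the two modalities and remain invertible enough to reconstruct each one, which is one common way to regularize a cross-modal mapping.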
