Frog-GNN: Multi-perspective aggregation based graph neural network for few-shot text classification

Abstract Few-shot text classification aims to learn a classifier from very few labeled text instances per class. Previous few-shot research works in NLP are mainly based on Prototypical Networks, which encode support set samples of each class to prototype representations and compute distance between query and each class prototype. In the prototype aggregation progress, much useful information of support set and discrepancy between samples from different class are ignored. In contrast, our model focuses on all query-support pairs without information loss. In this paper, we propose a multi-perspective aggregation based graph neural network (Frog-GNN) that observes through eyes (support and query instance) and speaks by mouth (pair) for few-shot text classification. We construct a graph by pre-trained pair representations and aggregate information from neighborhoods by instance-level representations for message-passing. The final relational features of pairs imply intra-class similarity and inter-class dissimilarity after iteratively interactions among instances. In addition, our Frog-GNN with meta-learning strategy can well generalize to unseen class. Experimental results demonstrate that the proposed GNN model outperforms existing few-shot approaches in both few-shot text classification and relation classification on three benchmark datasets.

[1]  Jian Sun,et al.  Induction Networks for Few-Shot Text Classification , 2019, EMNLP/IJCNLP.

[2]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[3]  Stan Matwin,et al.  Attentive Task-Agnostic Meta-Learning for Few-Shot Text Classification , 2018 .

[4]  Hong Yu,et al.  Meta Networks , 2017, ICML.

[5]  Gregory R. Koch,et al.  Siamese Neural Networks for One-Shot Image Recognition , 2015 .

[6]  Jeffrey Ling,et al.  Matching the Blanks: Distributional Similarity for Relation Learning , 2019, ACL.

[7]  Zhen-Hua Ling,et al.  Multi-Level Matching and Aggregation Network for Few-Shot Relation Classification , 2019, ACL.

[8]  Diyi Yang,et al.  Hierarchical Attention Networks for Document Classification , 2016, NAACL.

[9]  Sergey Levine,et al.  Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks , 2017, ICML.

[10]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[11]  Razvan Pascanu,et al.  Interaction Networks for Learning about Objects, Relations and Physics , 2016, NIPS.

[12]  Taesup Kim,et al.  Edge-Labeling Graph Neural Network for Few-Shot Learning , 2019, 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Jun Zhao,et al.  Recurrent Convolutional Neural Networks for Text Classification , 2015, AAAI.

[14]  Trevor Darrell,et al.  Caffe: Convolutional Architecture for Fast Feature Embedding , 2014, ACM Multimedia.

[15]  Yedid Hoshen,et al.  VAIN: Attentional Multi-agent Predictive Modeling , 2017, NIPS.

[16]  Julian J. McAuley,et al.  Ups and Downs: Modeling the Visual Evolution of Fashion Trends with One-Class Collaborative Filtering , 2016, WWW.

[17]  Tong Zhang,et al.  Deep Pyramid Convolutional Neural Networks for Text Categorization , 2017, ACL.

[18]  Regina Barzilay,et al.  Few-shot Text Classification with Distributional Signatures , 2019, ICLR.

[19]  Ah Chung Tsoi,et al.  The Graph Neural Network Model , 2009, IEEE Transactions on Neural Networks.

[20]  Geoffrey E. Hinton,et al.  Dynamic Routing Between Capsules , 2017, NIPS.

[21]  Eunho Yang,et al.  Learning to Propagate Labels: Transductive Propagation Network for Few-Shot Learning , 2018, ICLR.

[22]  Zhiyuan Liu,et al.  FewRel: A Large-Scale Supervised Few-Shot Relation Classification Dataset with State-of-the-Art Evaluation , 2018, EMNLP.

[23]  Joan Bruna,et al.  Few-Shot Learning with Graph Neural Networks , 2017, ICLR.

[24]  Justin Salamon,et al.  Deep Convolutional Neural Networks and Data Augmentation for Environmental Sound Classification , 2016, IEEE Signal Processing Letters.

[25]  Shuicheng Yan,et al.  Semantic Object Parsing with Graph LSTM , 2016, ECCV.

[26]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.

[27]  Abhinav Gupta,et al.  Zero-Shot Recognition via Semantic Embeddings and Knowledge Graphs , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[28]  Maosong Sun,et al.  FewRel 2.0: Towards More Challenging Few-Shot Relation Classification , 2019, EMNLP.

[29]  Pieter Abbeel,et al.  A Simple Neural Attentive Meta-Learner , 2017, ICLR.

[30]  F. Scarselli,et al.  A new model for learning in graph domains , 2005, Proceedings. 2005 IEEE International Joint Conference on Neural Networks, 2005..

[31]  Trevor Darrell,et al.  DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.

[32]  Zhiyuan Liu,et al.  Hybrid Attention-Based Prototypical Networks for Noisy Few-Shot Relation Classification , 2019, AAAI.

[33]  Khalil Sima'an,et al.  Graph Convolutional Encoders for Syntax-aware Neural Machine Translation , 2017, EMNLP.

[34]  Oriol Vinyals,et al.  Matching Networks for One Shot Learning , 2016, NIPS.

[35]  Tao Xiang,et al.  Learning to Compare: Relation Network for Few-Shot Learning , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[36]  Richard S. Zemel,et al.  Prototypical Networks for Few-shot Learning , 2017, NIPS.

[37]  Hugo Larochelle,et al.  Optimization as a Model for Few-Shot Learning , 2016, ICLR.

[38]  Christopher D. Manning,et al.  Graph Convolution over Pruned Dependency Trees Improves Relation Extraction , 2018, EMNLP.