Meta-Information Guided Meta-Learning for Few-Shot Relation Classification

Few-shot classification requires classifiers to adapt to new classes with only a few training instances. State-of-the-art meta-learning approaches such as MAML learn how to initialize and rapidly adapt model parameters from limited instances, and have shown promising results in few-shot classification. However, existing meta-learning models rely solely on implicit instance-based statistics, and thus suffer from instance unreliability and weak interpretability. To address this problem, we propose a novel meta-information guided meta-learning (MIML) framework, in which semantic concepts of classes provide strong guidance for meta-learning in both initialization and adaptation. In effect, our model establishes connections between instance-based information and semantic information, which enables more effective initialization and faster adaptation. Comprehensive experimental results on few-shot relation classification demonstrate the effectiveness of the proposed framework. Notably, MIML achieves performance comparable or superior to that of humans with only one shot in the FewRel evaluation. The source code and experimental details are available at https://github.com/thunlp/MIML.
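To make the "initialize then fast-adapt" loop referred to above concrete, the following is a minimal sketch of a generic MAML-style episode in plain PyTorch. The toy dimensions, the synthetic episodes, and the plain linear classifier over pre-encoded sentences are illustrative assumptions, not the authors' implementation; in particular, MIML's meta-information guidance is not reproduced here and can be found in the linked repository.

```python
# Minimal MAML-style meta-learning sketch: meta-learn an initialization (theta),
# fast-adapt it on a few support instances per episode, and update theta from the
# query loss of the adapted parameters. All sizes and data here are synthetic.
import torch
import torch.nn.functional as F

feat_dim, n_way, k_shot = 64, 5, 1          # 5-way 1-shot episodes over fixed features
inner_lr, meta_lr, inner_steps = 0.1, 1e-3, 1

# Meta-learned initialization: a linear classifier over pre-encoded sentences.
theta = [torch.zeros(n_way, feat_dim, requires_grad=True),
         torch.zeros(n_way, requires_grad=True)]
meta_opt = torch.optim.Adam(theta, lr=meta_lr)

def forward(x, params):
    W, b = params
    return F.linear(x, W, b)

def sample_episode():
    # Stand-in for sampling a real few-shot relation-classification episode.
    sx = torch.randn(n_way * k_shot, feat_dim)
    sy = torch.arange(n_way).repeat_interleave(k_shot)
    qx = torch.randn(n_way * 5, feat_dim)
    qy = torch.arange(n_way).repeat_interleave(5)
    return sx, sy, qx, qy

for step in range(100):                     # outer (meta) loop
    sx, sy, qx, qy = sample_episode()
    fast = [p for p in theta]
    for _ in range(inner_steps):            # inner loop: fast adaptation on the support set
        loss = F.cross_entropy(forward(sx, fast), sy)
        grads = torch.autograd.grad(loss, fast, create_graph=True)
        fast = [p - inner_lr * g for p, g in zip(fast, grads)]
    meta_loss = F.cross_entropy(forward(qx, fast), qy)   # adapted params on the query set
    meta_opt.zero_grad()
    meta_loss.backward()                    # second-order gradients flow back into theta
    meta_opt.step()
```

MIML differs from this vanilla loop by injecting class-level semantic concepts into both the initialization and the adaptation step, rather than relying on instance statistics alone; the sketch only shows the instance-based backbone that such guidance would attach to.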
