Hypernymy Detection for Low-Resource Languages via Meta Learning

Hypernymy detection, also known as lexical entailment, is a fundamental sub-task of many natural language understanding tasks. Previous work has mostly focused on monolingual hypernymy detection for high-resource languages such as English, while the low-resource scenario remains largely unexplored. This paper addresses low-resource hypernymy detection by leveraging high-resource languages. We extensively compare three joint training paradigms and, for the first time, propose applying meta-learning to alleviate the low-resource issue. Experiments demonstrate that our method is the strongest of the three settings, substantially improving performance on extremely low-resource languages by preventing over-fitting on their small datasets.
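To make the meta-learning idea concrete, the sketch below shows how a first-order, Reptile-style meta-training loop over high-resource languages could be organized before fine-tuning on a low-resource language. It is a minimal illustration only: the classifier architecture, the `sample_task_batch` data loader, the embedding dimensions, and all hyperparameters are assumptions for exposition, not the paper's actual model or pipeline.

```python
# Minimal first-order meta-learning (Reptile-style) sketch for cross-lingual
# hypernymy detection. HypernymyClassifier, sample_task_batch, and the feature
# dimensions are hypothetical placeholders, not the paper's architecture.
import copy
import torch
import torch.nn as nn

class HypernymyClassifier(nn.Module):
    """Scores whether (hyponym, hypernym) holds for a concatenated word-pair embedding."""
    def __init__(self, dim=600, hidden=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, 1))

    def forward(self, pair_emb):
        return self.net(pair_emb).squeeze(-1)  # one logit per word pair

def sample_task_batch(lang, batch_size=32, dim=600):
    # Placeholder: in practice this would return cross-lingually aligned embeddings
    # of (hyponym, hypernym) pairs and binary labels for the given language.
    x = torch.randn(batch_size, dim)
    y = torch.randint(0, 2, (batch_size,)).float()
    return x, y

def reptile_train(model, languages, meta_steps=1000, inner_steps=5,
                  inner_lr=1e-2, meta_lr=0.1):
    loss_fn = nn.BCEWithLogitsLoss()
    for step in range(meta_steps):
        lang = languages[step % len(languages)]     # treat each high-resource language as a task
        fast = copy.deepcopy(model)                 # adapted copy for the inner loop
        opt = torch.optim.SGD(fast.parameters(), lr=inner_lr)
        for _ in range(inner_steps):                # a few gradient steps on the task
            x, y = sample_task_batch(lang)
            opt.zero_grad()
            loss_fn(fast(x), y).backward()
            opt.step()
        # Reptile meta-update: move meta-parameters toward the task-adapted parameters.
        with torch.no_grad():
            for p, q in zip(model.parameters(), fast.parameters()):
                p.add_(meta_lr * (q - p))
    return model

# After meta-training on high-resource languages, the model would be fine-tuned on the
# small low-resource dataset starting from the meta-learned initialization.
model = reptile_train(HypernymyClassifier(), languages=["en", "fr", "es"])
```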
