KEML: A Knowledge-Enriched Meta-Learning Framework for Lexical Relation Classification

Lexical relations describe how concepts are semantically related and are typically expressed as relation triples. Accurately predicting the lexical relation between two concepts is challenging because the textual patterns that signal such relations are sparse. We propose the Knowledge-Enriched Meta-Learning (KEML) framework for the task of lexical relation classification. In KEML, the LKB-BERT (Lexical Knowledge Base-BERT) model learns concept representations from massive text corpora, with rich lexical knowledge injected through distant supervision. A probability distribution over auxiliary tasks is then defined to improve the model's ability to recognize different types of lexical relations. Finally, we combine a meta-learning process over this auxiliary task distribution with supervised learning to train the neural lexical relation classifier. Experiments on multiple datasets show that KEML outperforms state-of-the-art methods.
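As a rough illustration of the training scheme described above, the sketch below combines a meta-learning phase over sampled auxiliary tasks with a subsequent supervised phase for relation classification. The Reptile-style first-order meta-update, the task sampler (sample_auxiliary_task), the pair classifier architecture, and all dimensions and hyperparameters are illustrative assumptions, not the paper's exact algorithm; the synthetic data merely keeps the example self-contained.

```python
# Minimal sketch: meta-learning over auxiliary tasks, then supervised training.
# All names, sizes, and the Reptile-style update are assumptions for illustration.
import copy
import torch
import torch.nn as nn

EMB_DIM, N_RELATIONS = 768, 10   # assumed sizes (e.g. BERT-base embeddings)

class RelationClassifier(nn.Module):
    """Scores a concept pair (head, tail) against the set of lexical relations."""
    def __init__(self, emb_dim: int, n_relations: int):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 * emb_dim, 256), nn.ReLU(),
            nn.Linear(256, n_relations),
        )

    def forward(self, head_emb, tail_emb):
        return self.mlp(torch.cat([head_emb, tail_emb], dim=-1))

def sample_auxiliary_task(batch_size=32):
    """Hypothetical sampler over the auxiliary-task distribution.
    Here it just returns random concept-pair embeddings and labels."""
    h = torch.randn(batch_size, EMB_DIM)
    t = torch.randn(batch_size, EMB_DIM)
    y = torch.randint(0, N_RELATIONS, (batch_size,))
    return h, t, y

model = RelationClassifier(EMB_DIM, N_RELATIONS)
loss_fn = nn.CrossEntropyLoss()

# --- Meta-learning phase (Reptile-style first-order update, assumed) ---
meta_lr, inner_lr, inner_steps = 0.1, 1e-3, 5
for episode in range(100):
    task_model = copy.deepcopy(model)
    inner_opt = torch.optim.SGD(task_model.parameters(), lr=inner_lr)
    for _ in range(inner_steps):
        h, t, y = sample_auxiliary_task()
        inner_opt.zero_grad()
        loss_fn(task_model(h, t), y).backward()
        inner_opt.step()
    # Move the meta-parameters toward the task-adapted parameters.
    with torch.no_grad():
        for p, q in zip(model.parameters(), task_model.parameters()):
            p += meta_lr * (q - p)

# --- Supervised phase on the labelled relation-classification data ---
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
for step in range(100):
    h, t, y = sample_auxiliary_task()   # stand-in for the real labelled training set
    opt.zero_grad()
    loss_fn(model(h, t), y).backward()
    opt.step()
```

The two phases share the same classifier parameters: the meta-updates initialize them so that a small amount of supervised fine-tuning suffices for each relation type.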
