Multilingual and Cross-Lingual Graded Lexical Entailment

Grounded in cognitive linguistics, graded lexical entailment (GR-LE) is concerned with fine-grained assertions regarding the directional hierarchical relationships between concepts on a continuous scale. In this paper, we present the first work on cross-lingual generalisation of GR-LE relation. Starting from HyperLex, the only available GR-LE dataset in English, we construct new monolingual GR-LE datasets for three other languages, and combine those to create a set of six cross-lingual GR-LE datasets termed CL-HYPERLEX. We next present a novel method dubbed CLEAR (Cross-Lingual Lexical Entailment Attract-Repel) for effectively capturing graded (and binary) LE, both monolingually in different languages as well as across languages (i.e., on CL-HYPERLEX). Coupled with a bilingual dictionary, CLEAR leverages taxonomic LE knowledge in a resource-rich language (e.g., English) and propagates it to other languages. Supported by cross-lingual LE transfer, CLEAR sets competitive baseline performance on three new monolingual GR-LE datasets and six cross-lingual GR-LE datasets. In addition, we show that CLEAR outperforms current state-of-the-art on binary cross-lingual LE detection by a wide margin for diverse language pairs.

[1]  Mark W. Altom,et al.  Given versus induced category representations: use of prototype and exemplar information in classification. , 1984, Journal of experimental psychology. Learning, memory, and cognition.

[2]  Carina Silberer,et al.  Learning Grounded Meaning Representations with Autoencoders , 2014, ACL.

[3]  Jeffrey Dean,et al.  Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[4]  Anna Korhonen,et al.  Cross-Lingual Induction and Transfer of Verb Classes Based on Word Vector Space Specialisation , 2017, EMNLP.

[5]  Ngoc Thang Vu,et al.  Hierarchical Embeddings for Hypernymy Detection and Directionality , 2017, EMNLP.

[6]  Goran Glavas,et al.  Explicit Retrofitting of Distributional Word Vectors , 2018, ACL.

[7]  Guillaume Lample,et al.  Word Translation Without Parallel Data , 2017, ICLR.

[8]  Ido Dagan,et al.  Directional distributional similarity for lexical inference , 2010, Natural Language Engineering.

[9]  Daniel Jurafsky,et al.  Robust Machine Translation Evaluation with Entailment Features , 2009, ACL.

[10]  James A. Hampton,et al.  Typicality, Graded Membership, and Vagueness , 2007, Cogn. Sci..

[11]  Omer Levy,et al.  Dependency-Based Word Embeddings , 2014, ACL.

[12]  Jingwei Zhang,et al.  Word Semantic Representations using Bayesian Probabilistic Tensor Factorization , 2014, EMNLP.

[13]  Matteo Negri,et al.  Semeval-2013 Task 8: Cross-lingual Textual Entailment for Content Synchronization , 2013, *SEMEVAL.

[14]  Ido Dagan,et al.  Improving Hypernymy Detection with an Integrated Path-based and Distributional Method , 2016, ACL.

[15]  Marine Carpuat,et al.  Sparse Bilingual Word Representations for Cross-lingual Lexical Entailment , 2016, HLT-NAACL.

[16]  Allan Collins,et al.  Experiments on semantic memory and language comprehension. , 1972 .

[17]  Jeffrey Pennington,et al.  GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[18]  Wanxiang Che,et al.  Learning Semantic Hierarchies via Word Embeddings , 2014, ACL.

[19]  Rada Mihalcea,et al.  SemEval-2010 Task 2: Cross-Lingual Lexical Substitution , 2009, SemEval@ACL.

[20]  Luke S. Zettlemoyer,et al.  Deep Contextualized Word Representations , 2018, NAACL.

[21]  H. Kamp,et al.  Prototype theory and compositionality , 1995, Cognition.

[22]  Ehud Rivlin,et al.  Placing search in context: the concept revisited , 2002, TOIS.

[23]  Felix Hill,et al.  SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity , 2016, EMNLP.

[24]  Hiroshi Kanayama,et al.  Learning Crosslingual Word Embeddings without Bilingual Corpora , 2016, EMNLP.

[25]  Daniela Gerz,et al.  Scoring Lexical Entailment with a Supervised Directional Similarity Network , 2018, ACL.

[26]  Stephen Roller,et al.  Hearst Patterns Revisited: Automatic Hypernym Detection from Large Text Corpora , 2018, ACL.

[27]  Felix Hill,et al.  HyperLex: A Large-Scale Evaluation of Graded Lexical Entailment , 2016, CL.

[28]  Goran Glavas,et al.  Adversarial Propagation and Zero-Shot Cross-Lingual Transfer of Word Vector Specialization , 2018, EMNLP.

[29]  Stephen Clark,et al.  Exploiting Image Generality for Lexical Entailment Detection , 2015, ACL.

[30]  Yoram Singer,et al.  Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..

[31]  Roberto Navigli,et al.  A Framework for the Construction of Monolingual and Cross-lingual Word Similarity Datasets , 2015, ACL.

[32]  Ivan Vulic Cross-Lingual Syntactically Informed Distributed Word Representations , 2017, EACL.

[33]  Graham Neubig,et al.  Cross-Lingual Word Embeddings for Low-Resource Language Modeling , 2017, EACL.

[34]  Tomas Mikolov,et al.  Enriching Word Vectors with Subword Information , 2016, TACL.

[35]  Douwe Kiela,et al.  Poincaré Embeddings for Learning Hierarchical Representations , 2017, NIPS.

[36]  Goran Glavas,et al.  Specializing Distributional Vectors of All Words for Lexical Entailment , 2019, RepL4NLP@ACL.

[37]  Dan Roth,et al.  Robust Cross-lingual Hypernymy Detection using Dependency Context , 2018, NAACL-HLT.

[38]  Eneko Agirre,et al.  A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings , 2018, ACL.

[39]  Nikola Ljubesic,et al.  {bs,hr,sr}WaC - Web Corpora of Bosnian, Croatian and Serbian , 2014, WaC@EACL.

[40]  Ivan Vulić,et al.  Specialising Word Vectors for Lexical Entailment , 2017, NAACL.

[41]  Samuel L. Smith,et al.  Offline bilingual word vectors, orthogonal transformations and the inverted softmax , 2017, ICLR.

[42]  Goran Glavas,et al.  How to (Properly) Evaluate Cross-Lingual Word Embeddings: On Strong Baselines, Comparative Analyses, and Some Misconceptions , 2019, ACL.

[43]  Nigel Collier,et al.  SemEval-2017 Task 2: Multilingual and Cross-lingual Semantic Word Similarity , 2017, *SEMEVAL.

[44]  Guillaume Lample,et al.  XNLI: Evaluating Cross-lingual Sentence Representations , 2018, EMNLP.

[45]  E. Rosch Cognitive Representations of Semantic Categories. , 1975 .

[46]  Thomas Hofmann,et al.  Hyperbolic Entailment Cones for Learning Hierarchical Embeddings , 2018, ICML.

[47]  Prakhar Gupta,et al.  Learning Word Vectors for 157 Languages , 2018, LREC.

[48]  Felix Hill,et al.  SimLex-999: Evaluating Semantic Models With (Genuine) Similarity Estimation , 2014, CL.

[49]  Andrey Kutuzov,et al.  Texts in, meaning out: neural language models in semantic similarity task for Russian , 2015, ArXiv.

[50]  Catherine Havasi,et al.  ConceptNet 5.5: An Open Multilingual Graph of General Knowledge , 2016, AAAI.

[51]  Steven Skiena,et al.  Polyglot: Distributed Word Representations for Multilingual NLP , 2013, CoNLL.

[52]  Ann Bies,et al.  Cross-Document, Cross-Language Event Coreference Annotation Using Event Hoppers , 2018, LREC.

[53]  Steve Young,et al.  Semantic Specialization of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints , 2017 .

[54]  Saif Mohammad,et al.  Experiments with three approaches to recognizing lexical entailment , 2014, Natural Language Engineering.

[55]  Dominik Schlechtweg,et al.  Hypernyms under Siege: Linguistically-motivated Artillery for Hypernymy Detection , 2016, EACL.

[56]  Makoto Miwa,et al.  Word Embedding-based Antonym Detection using Thesauri and Distributional Information , 2015, NAACL.

[57]  Jonathan Pool,et al.  PanLex: Building a Resource for Panlingual Lexical Translation , 2014, LREC.

[58]  Anna Korhonen,et al.  Semantic Specialization of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints , 2017, TACL.

[59]  Qin Lu,et al.  Chasing Hypernyms in Vector Spaces with Entropy , 2014, EACL.

[60]  Tsung-Hsien Wen,et al.  Neural Belief Tracker: Data-Driven Dialogue State Tracking , 2016, ACL.

[61]  Philipp Cimiano,et al.  Representing Multilingual Data as Linked Data: the Case of BabelNet 2.0 , 2014, LREC.

[62]  Goran Glavas,et al.  Generalized Tuning of Distributional Word Vectors for Monolingual and Cross-Lingual Lexical Entailment , 2019, ACL.

[63]  Goran Glavas,et al.  Dual Tensor Model for Detecting Asymmetric Lexico-Semantic Relations , 2017, EMNLP.

[64]  Ido Dagan,et al.  The Distributional Inclusion Hypotheses and Lexical Entailment , 2005, ACL.

[65]  Siddharth Patwardhan,et al.  The Role of Context Types and Dimensionality in Learning Word Embeddings , 2016, NAACL.

[66]  David J. Weir,et al.  Learning to Distinguish Hypernyms and Co-Hyponyms , 2014, COLING.

[67]  Silvia Bernardini,et al.  The WaCky wide web: a collection of very large linguistically processed web-crawled corpora , 2009, Lang. Resour. Evaluation.

[68]  Nigel Collier,et al.  Card-660: Cambridge Rare Word Dataset - a Reliable Benchmark for Infrequent Word Representation Models , 2018, EMNLP 2018.

[69]  Roi Reichart,et al.  Separated by an Un-common Language: Towards Judgment Language Informed Vector Space Modeling , 2015 .

[70]  Francis Bond,et al.  Linking and Extending an Open Multilingual Wordnet , 2013, ACL.