Multi-SimLex: A Large-Scale Evaluation of Multilingual and Crosslingual Lexical Semantic Similarity

We introduce Multi-SimLex, a large-scale lexical resource and evaluation benchmark covering data sets for 12 typologically diverse languages, including major languages (e.g., Mandarin Chinese, Spanish, Russian) as well as less-resourced ones (e.g., Welsh, Kiswahili). Each language data set is annotated for the lexical relation of semantic similarity and contains 1,888 semantically aligned concept pairs, providing representative coverage of word classes (nouns, verbs, adjectives, adverbs), frequency ranks, similarity intervals, lexical fields, and concreteness levels. Additionally, owing to the alignment of concepts across languages, we provide a suite of 66 crosslingual semantic similarity data sets. Because of its extensive size and language coverage, Multi-SimLex provides entirely novel opportunities for experimental evaluation and analysis. On its monolingual and crosslingual benchmarks, we evaluate and analyze a wide array of recent state-of-the-art monolingual and crosslingual representation models, including static and contextualized word embeddings (such as fastText, monolingual and multilingual BERT, XLM), externally informed lexical representations, as well as fully unsupervised and (weakly) supervised crosslingual word embeddings. We also present a step-by-step data set creation protocol for creating consistent, Multi-SimLex-style resources for additional languages. We make these contributions (the public release of Multi-SimLex data sets, their creation protocol, strong baseline results, and in-depth analyses that can help guide future developments in multilingual lexical semantics and representation learning) available via a Web site that will encourage community effort in further expansion of Multi-SimLex to many more languages. Such a large-scale semantic resource could inspire significant further advances in NLP across languages.
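
The figure of 66 crosslingual data sets is consistent with one data set per unordered pair of the 12 languages, since 12 × 11 / 2 = 66. The following is a minimal sketch of that arithmetic, assuming one data set per language pair. The labels L01 to L12 are hypothetical placeholders, not the actual language names or codes.

```python
from itertools import combinations

# Hypothetical placeholder labels; the abstract names only some of the 12 languages.
languages = [f"L{i:02d}" for i in range(1, 13)]

# Assumption: each unordered pair of distinct languages yields one crosslingual data set.
crosslingual_pairs = list(combinations(languages, 2))

print(len(languages))           # 12 monolingual data sets
print(len(crosslingual_pairs))  # 66 crosslingual data sets
```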
