Multi-SimLex: A Large-Scale Evaluation of Multilingual and Crosslingual Lexical Semantic Similarity
Thierry Poibeau | Anna Korhonen | Simon Baker | Roi Reichart | Edoardo Maria Ponti | Ulla Petti | Olga Majewska | Ivan Vulić | Ira Leviant | Kelly Wing | Eden Bar | Matt Malone
[1] David Evans,et al. Tracking and summarizing news on a daily basis with Columbia's Newsblaster , 2002 .
[2] Robert Forkel,et al. The World Atlas of Language Structures Online , 2009 .
[3] Ivan Vulic,et al. Unsupervised Cross-Lingual Representation Learning , 2019, ACL.
[4] Regina Barzilay,et al. Climbing the Tower of Babel: Unsupervised Multilingual Learning , 2010, ICML.
[5] Goran Glavas,et al. Semantic Specialization of Distributional Word Vectors , 2019, EMNLP/IJCNLP.
[6] Roy Schwartz,et al. Symmetric Patterns and Coordinations: Fast and Enhanced Representations of Verbs and Adjectives , 2016, HLT-NAACL.
[7] Dominik Schlechtweg,et al. A Wind of Change: Detecting and Evaluating Lexical Semantic Change across Times and Domains , 2019, ACL.
[8] Anna Korhonen,et al. Isomorphic Transfer of Syntactic Structures in Cross-Lingual NLP , 2018, ACL.
[9] Jonas Ardö,et al. The FLUXNET2015 dataset and the ONEFlux processing pipeline for eddy covariance data , 2020, Scientific Data.
[10] Mikel Artetxe,et al. On the Cross-lingual Transferability of Monolingual Representations , 2019, ACL.
[11] Z. Harris,et al. Methods in structural linguistics. , 1952 .
[12] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.
[13] Miriam Van Staden,et al. The semantic categories of cutting and breaking events: A crosslinguistic perspective , 2007 .
[14] Jouko Vankka,et al. Finnish resources for evaluating language model semantics , 2017, NODALIDA.
[15] Hervé Jégou,et al. Loss in Translation: Learning Bilingual Word Mapping with a Retrieval Criterion , 2018, EMNLP.
[16] Guillaume Lample,et al. Word Translation Without Parallel Data , 2017, ICLR.
[17] Jason Weston,et al. Natural Language Processing (Almost) from Scratch , 2011, J. Mach. Learn. Res..
[18] Anders Søgaard,et al. A Survey of Cross-lingual Word Embedding Models , 2017, J. Artif. Intell. Res..
[19] Dorothy Holland,et al. Culture and cognition , 1987 .
[20] Alessandro Lenci,et al. Distributional Memory: A General Framework for Corpus-Based Semantics , 2010, CL.
[21] Mark Dredze,et al. Are All Languages Created Equal in Multilingual BERT? , 2020, REPL4NLP.
[22] Samuel L. Smith,et al. Offline bilingual word vectors, orthogonal transformations and the inverted softmax , 2017, ICLR.
[23] Zornitsa Kozareva,et al. SemEval-2012 Task 7: Choice of Plausible Alternatives: An Evaluation of Commonsense Causal Reasoning , 2011, *SEMEVAL.
[24] Maosong Sun,et al. COS960: A Chinese Word Similarity Dataset of 960 Word Pairs , 2019, ArXiv.
[25] Goran Glavas,et al. How to (Properly) Evaluate Cross-Lingual Word Embeddings: On Strong Baselines, Comparative Analyses, and Some Misconceptions , 2019, ACL.
[26] Patrick Littell,et al. URIEL and lang2vec: Representing languages as typological, geographical, and phylogenetic vectors , 2017, EACL.
[27] Manaal Faruqui,et al. Cross-lingual Models of Word Embeddings: An Empirical Comparison , 2016, ACL.
[28] Ken-ichi Kawarabayashi,et al. Are Girls Neko or Shōjo? Cross-Lingual Alignment of Non-Isomorphic Embeddings with Iterative Normalization , 2019, ACL.
[29] Monojit Choudhury,et al. The State and Fate of Linguistic Diversity and Inclusion in the NLP World , 2020, ACL.
[30] Prakhar Gupta,et al. Learning Word Vectors for 157 Languages , 2018, LREC.
[31] Eneko Agirre,et al. Uncovering Divergent Linguistic Information in Word Embeddings with Lessons for Intrinsic and Extrinsic Evaluation , 2018, CoNLL.
[32] Felix Hill,et al. SimLex-999: Evaluating Semantic Models With (Genuine) Similarity Estimation , 2014, CL.
[33] Lukasz Kaiser,et al. Attention is All you Need , 2017, NIPS.
[34] Beth Levin,et al. English Verb Classes and Alternations: A Preliminary Investigation , 1993 .
[35] Steve Young,et al. Semantic Specialization of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints , 2017 .
[36] Gert Storms,et al. Word associations: Network and semantic properties , 2008, Behavior research methods.
[37] Isabelle Augenstein,et al. From Phonology to Syntax: Unsupervised Linguistic Typology at Different Levels with Language Embeddings , 2018, NAACL-HLT.
[38] Omer Levy,et al. RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.
[39] Siddharth Patwardhan,et al. The Role of Context Types and Dimensionality in Learning Word Embeddings , 2016, NAACL.
[40] Martin Potthast,et al. CoNLL 2018 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies , 2018, CoNLL.
[41] Roy Schwartz,et al. Symmetric Pattern Based Word Embeddings for Improved Word Similarity Prediction , 2015, CoNLL.
[42] Guillaume Lample,et al. Polyglot Neural Language Models: A Case Study in Cross-Lingual Phonetic Representation Learning , 2016, NAACL.
[43] Elia Bruni,et al. Multimodal Distributional Semantics , 2014, J. Artif. Intell. Res..
[44] Martha Palmer,et al. Verb Semantics and Lexical Selection , 1994, ACL.
[45] Adam Lopez,et al. From Characters to Words to in Between: Do We Capture Morphology? , 2017, ACL.
[46] Veselin Stoyanov,et al. Unsupervised Cross-lingual Representation Learning at Scale , 2019, ACL.
[47] Graeme Hirst,et al. Evaluating WordNet-based Measures of Lexical Semantic Relatedness , 2006, CL.
[48] André Freitas,et al. SemR-11: A Multi-Lingual Gold-Standard for Semantic Similarity and Relatedness for Eleven Languages , 2018, LREC.
[49] Shafiq R. Joty,et al. Revisiting Adversarial Autoencoder for Unsupervised Word Translation with Cycle Consistency and Improved Training , 2019, NAACL.
[50] Ehud Rivlin,et al. Placing search in context: the concept revisited , 2002, TOIS.
[51] Goran Glavas,et al. Probing Pretrained Language Models for Lexical Semantics , 2020, EMNLP.
[52] Felix Hill,et al. SimVerb-3500: A Large-Scale Evaluation Set of Verb Similarity , 2016, EMNLP.
[53] J. Trier. Der deutsche Wortschatz im Sinnbezirk des Verstandes : von den Anfängen bis zum Beginn des 13. Jahrhunderts , 1973 .
[54] Martine Vanhove,et al. From Polysemy to Semantic Change: Towards a Typology of Lexical Semantic Associations , 2008 .
[55] Susanne Vejdemo,et al. Lexical change often begins and ends in semantic peripheries , 2018, Pragmatics and Cognition.
[56] Mohammad Sadegh Rasooli,et al. Cross-Lingual Syntactic Transfer with Limited Resources , 2017, Transactions of the Association for Computational Linguistics.
[57] Tommi S. Jaakkola,et al. Gromov-Wasserstein Alignment of Word Embedding Spaces , 2018, EMNLP.
[58] Ryan Cotterell,et al. Towards Zero-shot Language Modeling , 2019, EMNLP.
[59] Danqi Chen,et al. A Fast and Accurate Dependency Parser using Neural Networks , 2014, EMNLP.
[60] Fan Yang,et al. XGLUE: A New Benchmark Dataset for Cross-lingual Pre-training, Understanding and Generation , 2020, EMNLP.
[61] Lu Chen,et al. Towards Universal Dialogue State Tracking , 2018, EMNLP.
[62] Agnieszka Mykowiecka,et al. SimLex-999 for Polish , 2018, LREC.
[63] Ming-Wei Chang,et al. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.
[64] Thomas A. Schreiber,et al. The University of South Florida free association, rhyme, and word fragment norms , 2004, Behavior research methods, instruments, & computers : a journal of the Psychonomic Society, Inc.
[65] R. Bro,et al. Centering and scaling in component analysis , 2003 .
[66] Jason Baldridge,et al. PAWS: Paraphrase Adversaries from Word Scrambling , 2019, NAACL.
[67] Eric Fosler-Lussier,et al. Adjusting Word Embeddings with Semantic Intensity Orders , 2016, Rep4NLP@ACL.
[68] Kawin Ethayarajh,et al. How Contextual are Contextualized Word Representations? Comparing the Geometry of BERT, ELMo, and GPT-2 Embeddings , 2019, EMNLP.
[69] Kevin Gimpel,et al. From Paraphrase Database to Compositional Paraphrase Model and Back , 2015, Transactions of the Association for Computational Linguistics.
[70] Nigel Collier,et al. Card-660: Cambridge Rare Word Dataset - a Reliable Benchmark for Infrequent Word Representation Models , 2018, EMNLP 2018.
[71] Roi Reichart,et al. Separated by an Un-common Language: Towards Judgment Language Informed Vector Space Modeling , 2015 .
[72] Rémi Louf,et al. HuggingFace's Transformers: State-of-the-art Natural Language Processing , 2019, ArXiv.
[73] Ponrudee Netisopakul,et al. Word Similarity Datasets for Thai: Construction and Evaluation , 2019, IEEE Access.
[74] Goran Glavas,et al. Adversarial Propagation and Zero-Shot Cross-Lingual Transfer of Word Vector Specialization , 2018, EMNLP.
[75] Steven Schockaert,et al. Improving Cross-Lingual Word Embeddings by Meeting in the Middle , 2018, EMNLP.
[76] Martha Palmer,et al. Verbnet: a broad-coverage, comprehensive verb lexicon , 2005 .
[77] Xinying Chen,et al. Classifying Languages by Dependency Structure. Typologies of Delexicalized Universal Dependency Treebanks , 2017, DepLing.
[78] Anna Korhonen,et al. A Systematic Study of Leveraging Subword Information for Learning Word Representations , 2019, NAACL.
[79] Jason Baldridge,et al. PAWS-X: A Cross-lingual Adversarial Dataset for Paraphrase Identification , 2019, EMNLP.
[80] J. Trier. Der deutsche Wortschatz im Sinnbezirk des Verstandes : die Geschichte eines Sprachlichen Feldes , 1931 .
[81] Silvia Bernardini,et al. The WaCky wide web: a collection of very large linguistically processed web-crawled corpora , 2009, Lang. Resour. Evaluation.
[82] Stephen Clark,et al. A Systematic Study of Semantic Vector Space Model Parameters , 2014, CVSC@EACL.
[83] John B. Lowe,et al. The Berkeley FrameNet Project , 1998, ACL.
[84] Roy Schwartz,et al. Automatic Selection of Context Configurations for Improved Class-Specific Word Representations , 2016, CoNLL.
[85] Graham Neubig,et al. XTREME: A Massively Multilingual Multi-task Benchmark for Evaluating Cross-lingual Generalization , 2020, ICML.
[86] Eneko Agirre,et al. Translation Artifacts in Cross-lingual Transfer Learning , 2020, EMNLP.
[87] Anna Korhonen,et al. On the Role of Seed Lexicons in Learning Bilingual Word Embeddings , 2016, ACL.
[88] Laure Thompson,et al. The strange geometry of skip-gram with negative sampling , 2017, EMNLP.
[89] Eunsol Choi,et al. TyDi QA: A Benchmark for Information-Seeking Question Answering in Typologically Diverse Languages , 2020, Transactions of the Association for Computational Linguistics.
[90] Célia Michotey,et al. The GenTree Dendroecological Collection, tree-ring and wood density data from seven tree species across Europe , 2020, Scientific Data.
[91] Noah A. Smith,et al. Many Languages, One Parser , 2016, TACL.
[92] Goran Glavas,et al. Specializing Distributional Vectors of All Words for Lexical Entailment , 2019, RepL4NLP@ACL.
[93] Karl Pearson. LIII. On lines and planes of closest fit to systems of points in space , 1901 .
[94] Roi Reichart,et al. Deep Contextualized Self-training for Low Resource Dependency Parsing , 2019, Transactions of the Association for Computational Linguistics.
[95] Mona T. Diab,et al. Context-Aware Cross-Lingual Mapping , 2019, NAACL.
[96] Goran Glavas,et al. Explicit Retrofitting of Distributional Word Vectors , 2018, ACL.
[97] Tomas Mikolov,et al. Advances in Pre-Training Distributed Word Representations , 2017, LREC.
[98] Goran Glavas,et al. Informing Unsupervised Pretraining with External Linguistic Knowledge , 2019, ArXiv.
[99] Anna Korhonen,et al. An Unsupervised Model for Instance Level Subcategorization Acquisition , 2014, EMNLP.
[100] Zellig S. Harris,et al. Methods in structural linguistics. , 1952 .
[101] R. A. van den Berg,et al. Centering, scaling, and transformations: improving the biological information content of metabolomics data , 2006, BMC Genomics.
[102] Anna Korhonen,et al. Evaluation by Association: A Systematic Study of Quantitative Word Association Evaluation , 2017, EACL.
[103] Veselin Stoyanov,et al. Emerging Cross-lingual Structure in Pretrained Language Models , 2020, ACL.
[104] Dan Klein,et al. Multilingual Alignment of Contextual Word Representations , 2020, ICLR.
[105] Jason Eisner,et al. Lexical Semantics , 2020, The Handbook of English Linguistics.
[106] Roberto Navigli,et al. BabelDomains: Large-Scale Domain Labeling of Lexical Resources , 2017, EACL.
[107] Christopher D. Manning,et al. Better Word Representations with Recursive Neural Networks for Morphology , 2013, CoNLL.
[108] Luke S. Zettlemoyer,et al. Deep Contextualized Word Representations , 2018, NAACL.
[109] Goran Glavas,et al. Do We Really Need Fully Unsupervised Cross-Lingual Embeddings? , 2019, EMNLP.
[110] Tomas Mikolov,et al. Enriching Word Vectors with Subword Information , 2016, TACL.
[111] Mamoru Komachi,et al. Construction of a Japanese Word Similarity Dataset , 2017, LREC.
[112] J. R. Firth,et al. A Synopsis of Linguistic Theory, 1930-1955 , 1957 .
[113] Eva Schlinger,et al. How Multilingual is Multilingual BERT? , 2019, ACL.
[114] Georgiana Dinu,et al. Hubness and Pollution: Delving into Cross-Space Mapping for Zero-Shot Learning , 2015, ACL.
[115] Goran Glavas,et al. Discriminating between Lexico-Semantic Relations with the Specialization Tensor Model , 2018, NAACL.
[116] Sebastian Riedel,et al. MLQA: Evaluating Cross-lingual Extractive Question Answering , 2019, ACL.
[117] Eneko Agirre,et al. Learning bilingual word embeddings with (almost) no bilingual data , 2017, ACL.
[118] Jason Weston,et al. A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.
[119] Goran Glavas,et al. Multilingual and Cross-Lingual Graded Lexical Entailment , 2019, ACL.
[120] Kevin Gimpel,et al. ALBERT: A Lite BERT for Self-supervised Learning of Language Representations , 2019, ICLR.
[121] Thierry Poibeau,et al. Modeling Language Variation and Universals: A Survey on Typological Linguistics for Natural Language Processing , 2018, Computational Linguistics.
[122] Neville Ryant,et al. A large-scale classification of English verbs , 2008, Lang. Resour. Evaluation.
[123] Stephen Clark,et al. Specializing Word Embeddings for Similarity or Relatedness , 2015, EMNLP.
[124] Alexandre François,et al. Semantic maps and the typology of colexification: Intertwining polysemous networks across languages , 2008 .
[125] Olcay Taner Yildiz,et al. AnlamVer: Semantic Model Evaluation Dataset for Turkish - Word Similarity and Relatedness , 2018, COLING.
[126] Graham Neubig,et al. Learning Language Representations for Typology Prediction , 2017, EMNLP.
[127] Michael Meeuwis,et al. Order of subject, object, and verb , 2013 .
[128] Jörg Tiedemann,et al. What Do Language Representations Really Represent? , 2019, Computational Linguistics.
[129] Patrick Pantel,et al. From Frequency to Meaning: Vector Space Models of Semantics , 2010, J. Artif. Intell. Res..
[130] Alexandros Nanopoulos,et al. Hubs in Space: Popular Nearest Neighbors in High-Dimensional Data , 2010, J. Mach. Learn. Res..
[131] Nigel Collier,et al. SemEval-2017 Task 2: Multilingual and Cross-lingual Semantic Word Similarity , 2017, *SEMEVAL.
[132] Yoshua Bengio,et al. Learning to Understand Phrases by Embedding the Dictionary , 2015, TACL.
[133] Jörg Tiedemann,et al. Continuous multilinguality with language vectors , 2016, EACL.
[134] Graham Neubig,et al. Cross-Lingual Word Embeddings for Low-Resource Language Modeling , 2017, EACL.
[135] Tamir Hazan,et al. Perturbation Based Learning for Structured NLP Tasks with Application to Dependency Parsing , 2019, Transactions of the Association for Computational Linguistics.
[136] Yi Zhu,et al. On the Importance of Subword Information for Morphological Tasks in Truly Low-Resource Languages , 2019, CoNLL.
[137] Anna Korhonen,et al. Specializing Unsupervised Pretraining Models for Word-Level Semantic Similarity , 2019, COLING.
[138] Tiago Tresoldi,et al. The Database of Cross-Linguistic Colexifications, reproducible analysis of cross-linguistic polysemies , 2020, Scientific Data.
[139] Eneko Agirre,et al. Generalizing and Improving Bilingual Word Embedding Mappings with a Multi-Step Framework of Linear Transformations , 2018, AAAI.
[140] Michael Meeuwis,et al. 'Green' and 'blue' , 2013 .
[141] Sebastian Ruder,et al. A survey of cross-lingual embedding models , 2017, ArXiv.
[142] Quoc V. Le,et al. Exploiting Similarities among Languages for Machine Translation , 2013, ArXiv.
[143] Steven Schockaert,et al. On the Robustness of Unsupervised and Semi-supervised Cross-lingual Word Embedding Learning , 2020, LREC.
[144] N. Mantel. The detection of disease clustering and a generalized regression approach. , 1967, Cancer research.
[145] Guillaume Lample,et al. XNLI: Evaluating Cross-lingual Sentence Representations , 2018, EMNLP.
[146] Marco Saerens,et al. Centering Similarity Measures to Reduce Hubs , 2013, EMNLP.
[147] Pramod Viswanath,et al. All-but-the-Top: Simple and Effective Postprocessing for Word Representations , 2017, ICLR.
[148] Eneko Agirre,et al. A robust self-learning method for fully unsupervised cross-lingual mappings of word embeddings , 2018, ACL.
[149] Mark Dredze,et al. Beto, Bentz, Becas: The Surprising Cross-Lingual Effectiveness of BERT , 2019, EMNLP.
[150] Qianchu Liu,et al. Investigating Cross-Lingual Alignment Methods for Contextualized Embeddings with Token-Level Evaluation , 2019, CoNLL.
[151] Lior Wolf,et al. Non-Adversarial Unsupervised Word Translation , 2018, EMNLP.
[152] Claire Cardie,et al. Unsupervised Multilingual Word Embeddings , 2018, EMNLP.
[153] Graham Neubig,et al. Choosing Transfer Languages for Cross-Lingual Learning , 2019, ACL.
[154] George A. Miller,et al. WordNet: A Lexical Database for English , 1995, HLT.
[155] Eneko Agirre,et al. A Study on Similarity and Relatedness Using Distributional and WordNet-based Approaches , 2009, NAACL.
[156] Dong Wang,et al. Normalized Word Embedding and Orthogonal Transform for Bilingual Word Translation , 2015, NAACL.
[157] Felix Hill,et al. HyperLex: A Large-Scale Evaluation of Graded Lexical Entailment , 2016, CL.
[158] Goran Glavas,et al. From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers , 2020, ArXiv.
[159] Qianchu Liu,et al. XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning , 2020, EMNLP.
[160] Tapio Salakoski,et al. Multilingual is not enough: BERT for Finnish , 2019, ArXiv.
[161] Mike Schuster,et al. Japanese and Korean voice search , 2012, 2012 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).
[162] Anders Søgaard,et al. On the Limitations of Unsupervised Bilingual Dictionary Induction , 2018, ACL.
[163] Peter D. Turney. Domain and Function: A Dual-Space Model of Semantic Relations and Compositions , 2012, J. Artif. Intell. Res..
[164] Roberto Navigli,et al. A Framework for the Construction of Monolingual and Cross-lingual Word Similarity Datasets , 2015, ACL.
[165] Jeffrey S. Gruber,et al. Lexical structures in syntax and semantics , 1976 .
[166] David Vandyke,et al. Counter-fitting Word Vectors to Linguistic Constraints , 2016, NAACL.
[167] Martha Palmer,et al. Extending a Verb-lexicon Using a Semantically Annotated Corpus , 2004, LREC.
[168] Guillaume Lample,et al. Cross-lingual Language Model Pretraining , 2019, NeurIPS.
[169] Anna Korhonen,et al. Cross-lingual Semantic Specialization via Lexical Relation Induction , 2019, EMNLP.
[170] M. Lucas,et al. Semantic priming without association: A meta-analytic review , 2000, Psychonomic bulletin & review.
[171] Benjamin Lecouteux,et al. FlauBERT: Unsupervised Language Model Pre-training for French , 2020, LREC.
[172] Gökhan Tür,et al. Intent detection using semantically enriched word embeddings , 2016, 2016 IEEE Spoken Language Technology Workshop (SLT).
[173] Jonathan Pool,et al. PanLex: Building a Resource for Panlingual Lexical Translation , 2014, LREC.
[174] Magnus Sahlgren,et al. The Word-Space Model: using distributional analysis to represent syntagmatic and paradigmatic relations between words in high-dimensional vector spaces , 2006 .
[175] Anna Korhonen,et al. Semantic Specialization of Distributional Word Vector Spaces using Monolingual and Cross-Lingual Constraints , 2017, TACL.
[176] Virginia R. de Sa,et al. An Empirical Study on Post-processing Methods for Word Embeddings , 2019, ArXiv.
[177] Dan Roth,et al. Cross-Lingual Ability of Multilingual BERT: An Empirical Study , 2019, ICLR.
[178] Steffen Staab,et al. Learning Concept Hierarchies from Text Corpora using Formal Concept Analysis , 2005, J. Artif. Intell. Res..
[179] Samuel R. Bowman,et al. A Broad-Coverage Challenge Corpus for Sentence Understanding through Inference , 2017, NAACL.
[180] Omer Levy,et al. Dependency-Based Word Embeddings , 2014, ACL.
[181] Daniel Kondratyuk,et al. 75 Languages, 1 Model: Parsing Universal Dependencies Universally , 2019, EMNLP.
[182] Ankur Bapna,et al. Simple, Scalable Adaptation for Neural Machine Translation , 2019, EMNLP.