A Survey of Cross-lingual Word Embedding Models

Cross-lingual representations of words enable us to reason about word meaning in multilingual contexts and are a key facilitator of cross-lingual transfer when developing natural language processing models for low-resource languages. In this survey, we provide a comprehensive typology of cross-lingual word embedding models and compare their data requirements and objective functions. The recurring theme of the survey is that many of the models presented in the literature optimize for the same objectives, and that seemingly different models are often equivalent, differing only in optimization strategies, hyper-parameters, and similar details. We also discuss the different ways cross-lingual word embeddings are evaluated, as well as future challenges and research horizons.
