Morphologically Aware Word-Level Translation

We propose a novel morphologically aware probability model for bilingual lexicon induction (BLI) that jointly models lexeme translation and inflectional morphology in a structured way. The model exploits the basic linguistic intuition that the lexeme is the key lexical unit of meaning, while inflectional morphology provides additional syntactic information. This approach yields substantial gains over the state of the art: a 19% average improvement in accuracy across 6 language pairs in the supervised setting and 16% in the weakly supervised setting. As a further contribution, we highlight issues in modern BLI that stem from ignoring inflectional morphology and offer three suggestions for improving the task.
