One-Shot Neural Cross-Lingual Transfer for Paradigm Completion

We present a novel cross-lingual transfer method for paradigm completion, the task of mapping a lemma to its inflected forms, using a neural encoder-decoder model, the state of the art for the monolingual task. We use labeled data from a high-resource language to increase performance on a low-resource language. In experiments on 21 language pairs from four different language families, we obtain up to 58% higher accuracy than without transfer and show that even zero-shot and one-shot learning are possible. We further find that the degree of language relatedness strongly influences the ability to transfer morphological knowledge.
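As a rough illustration of the setup described above, the sketch below shows one way labeled examples from a high-resource and a low-resource language could be pooled into a single character-level training set for an encoder-decoder model. The language-token and tag-token encoding, the `Example` type, the helper names, and the Spanish/Portuguese example forms are all assumptions made for illustration; the abstract does not specify the input format.

```python
from typing import Iterable, List, NamedTuple, Tuple


class Example(NamedTuple):
    """One labeled paradigm cell (hypothetical container, not from the paper)."""
    lang: str        # language identifier, e.g. "es"
    lemma: str       # citation form
    tags: List[str]  # morphological tags of the target form
    form: str        # inflected target form


def encode_source(ex: Example) -> List[str]:
    """Source sequence: a language token, the tag tokens, then the lemma characters."""
    return [f"<{ex.lang}>"] + [f"<{t}>" for t in ex.tags] + list(ex.lemma)


def encode_target(ex: Example) -> List[str]:
    """Target sequence: the characters of the inflected form."""
    return list(ex.form)


def build_training_set(
    high: Iterable[Example], low: Iterable[Example]
) -> List[Tuple[List[str], List[str]]]:
    """Pool both languages so a single model can share parameters across them."""
    return [(encode_source(ex), encode_target(ex)) for ex in list(high) + list(low)]


# Illustrative data: many high-resource examples would normally be paired
# with very few (possibly one, or zero) low-resource examples.
high_resource = [Example("es", "hablar", ["V", "IND", "PRS", "1", "SG"], "hablo")]
low_resource = [Example("pt", "falar", ["V", "IND", "PRS", "1", "SG"], "falo")]

pairs = build_training_set(high_resource, low_resource)
```

Under this assumed encoding, the language token is the only signal telling the model which language to generate, so the character and tag representations are shared between the two languages.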
