BBN’s low-resource machine translation for the LoReHLT 2016 evaluation

We describe BBN’s contribution to the machine translation (MT) task in the LoReHLT 2016 evaluation, focusing on the techniques used to build Uyghur–English MT systems under low-resource conditions. In particular, we discuss the data selection process, morphological segmentation of the source language, neural network feature models, and our use of a native informant and related-language resources. Our final submission to the evaluation ranked first among all participants.
