Neural Network Language Models for Translation with Limited Data

In this paper we present how to estimate a continuous space language model with a neural network for use in a statistical machine translation system. We report results for an Italian-English translation task on a small corpus (about 150K tokens), which can be considered a task with scarce training data. The study considers different word history lengths for the connectionist language model (the n-gram order) and different continuous space representations (i.e. built from the words that appear more than k times in the training corpus). The experimental results are evaluated with automatic evaluation metrics that correlate with the fluency and adequacy of the generated translations.
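The two settings varied in the study, the n-gram order and the frequency threshold k, can be illustrated with a minimal data-preparation sketch. This is not the authors' implementation: the function names, the `<unk>` and `<s>` symbols, and the choice to map words outside the shortlist to a single unknown token are assumptions made only for illustration.

```python
from collections import Counter


def build_shortlist(tokens, k):
    """Return the set of words that appear more than k times in `tokens`."""
    counts = Counter(tokens)
    return {word for word, count in counts.items() if count > k}


def ngram_examples(sentence, n, shortlist, unk="<unk>", bos="<s>"):
    """Yield (history, target) pairs for an n-gram model of order n.

    Words outside the shortlist are mapped to `unk` (an assumption of this
    sketch), and the sentence is left-padded with n-1 `bos` symbols so that
    every target word has a history of exactly n-1 words.
    """
    mapped = [w if w in shortlist else unk for w in sentence]
    padded = [bos] * (n - 1) + mapped
    for i in range(n - 1, len(padded)):
        yield tuple(padded[i - n + 1:i]), padded[i]


# Toy training text; a real corpus would have on the order of 150K tokens.
train_tokens = "the cat sat on the mat the cat".split()
shortlist = build_shortlist(train_tokens, k=1)
examples = list(ngram_examples(["the", "cat", "sat"], n=3, shortlist=shortlist))
```

Each history would then be fed to the network as n-1 word indices, with the target word as the class to predict. Raising k shrinks the vocabulary the network has to represent, and raising n lengthens the history.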
