A Feasibility Study for Chinese-Spanish Statistical Machine Translation

This article describes an experimental prototype system for Chinese-to-Spanish and Spanish-to-Chinese machine translation. The system is built on the statistical machine translation (SMT) framework and, more specifically, implements the bilingual n-gram SMT approach. Because, to our knowledge, no large Chinese-Spanish parallel corpus is currently available for training, an alternative experimental method was used to build a training corpus. This method is compared, in terms of translation quality, with the simpler approach of using English as a bridge language for translation in both directions.
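The bridge-language baseline mentioned above can be illustrated with a minimal sketch: a Chinese-to-English system and an English-to-Spanish system are chained, so the output of the first becomes the input of the second. Everything in the block is an assumption made for illustration. The names `make_toy_translator` and `cascade` are hypothetical, and the tiny word-for-word lexicons merely stand in for trained SMT systems; none of it is taken from the prototype described in the article.

```python
from typing import Callable, Dict

# A translator is modelled as any function from a sentence to a sentence.
Translator = Callable[[str], str]


def make_toy_translator(lexicon: Dict[str, str]) -> Translator:
    """Build a word-for-word lookup translator (hypothetical stand-in for a real SMT system)."""
    def translate(sentence: str) -> str:
        # Unknown tokens are copied through unchanged.
        return " ".join(lexicon.get(token, token) for token in sentence.split())
    return translate


def cascade(first: Translator, second: Translator) -> Translator:
    """Chain two translators: source -> bridge language -> target."""
    def translate(sentence: str) -> str:
        return second(first(sentence))
    return translate


# Toy lexicons, invented for this example only.
ZH_EN = {"我": "I", "喝": "drink", "水": "water"}
EN_ES = {"I": "yo", "drink": "bebo", "water": "agua"}

zh_to_es = cascade(make_toy_translator(ZH_EN), make_toy_translator(EN_ES))

print(zh_to_es("我 喝 水"))  # prints: yo bebo agua
```

In a cascade of this kind, any error made in the first step is passed on to the second, which is one reason a direct system trained on Chinese-Spanish data is a natural point of comparison.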
