Statistical Approaches to Computer-Assisted Translation

Current machine translation (MT) systems are still not perfect. In practice, their output must be edited to correct errors. One way of increasing the productivity of the whole translation process (MT plus human work) is to incorporate the human correction activities within the translation process itself, thereby shifting the MT paradigm to that of computer-assisted translation. This model entails an iterative process in which the human translator's activity is included in the loop: in each iteration, a prefix of the translation is validated (accepted or amended) by the human, and the system computes its best (or n-best) translation suffix hypothesis to complete this prefix. A successful framework for MT is the so-called statistical (or pattern recognition) framework. Interestingly, within this framework, adapting MT systems to the interactive scenario mainly affects the search process, which allows extensive reuse of successful techniques and models. In this article, alignment templates, phrase-based models, and stochastic finite-state transducers are used to develop computer-assisted translation systems. These systems were assessed in a European project (TransType2) on two real tasks: the translation of printer manuals and the translation of the Bulletin of the European Union. In each task, the following three language pairs were involved (in both translation directions): English-Spanish, English-German, and English-French.
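The iterative prefix-and-suffix loop described above can be illustrated with a minimal Python sketch. It is not the search procedure of the systems in this article: it assumes, purely for illustration, that the system already holds a fixed n-best list of word-level hypotheses ordered best first, and that a simulated user accepts the longest correct part of each suggestion and then types the next correct word. The names `best_suffix` and `simulate_session` and the example sentences are hypothetical.

```python
from typing import List, Optional, Tuple


def best_suffix(prefix: List[str], nbest: List[List[str]]) -> Optional[List[str]]:
    """Return the suffix of the first hypothesis that starts with `prefix`.

    `nbest` is assumed to be ordered from best to worst, so the first
    compatible hypothesis is the best completion of the validated prefix.
    Returns None when no hypothesis is compatible with the prefix.
    """
    for hyp in nbest:
        if hyp[:len(prefix)] == prefix:
            return hyp[len(prefix):]
    return None


def simulate_session(nbest: List[List[str]],
                     reference: List[str]) -> Tuple[List[str], int]:
    """Simulate the interactive loop against a reference translation.

    Returns the final translation and the number of user corrections.
    """
    prefix: List[str] = []
    corrections = 0
    while True:
        # System step: complete the validated prefix with the best suffix.
        suffix = best_suffix(prefix, nbest) or []
        candidate = prefix + suffix
        if candidate == reference:
            return candidate, corrections
        # User step: accept words up to the first error ...
        i = len(prefix)
        while i < min(len(candidate), len(reference)) and candidate[i] == reference[i]:
            i += 1
        # ... then amend by typing the next correct word (or truncating).
        prefix = reference[:i + 1]
        corrections += 1
        if prefix == reference:
            return prefix, corrections


nbest = [
    "the printer is ready now".split(),
    "the printer is on now".split(),
    "the printer is on line".split(),
]
result, n_corrections = simulate_session(nbest, "the printer is on now".split())
print(" ".join(result), n_corrections)
```

In this toy run the user corrects a single word ("on"), after which the system proposes the remaining suffix on its own. A real system would search its translation model under the prefix constraint rather than filter a fixed list, and would typically operate at the keystroke level.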
