The Alignment Template Approach to Statistical Machine Translation

A phrase-based statistical machine translation approach, the alignment template approach, is described. This translation approach allows for general many-to-many relations between words. Thereby, the context of words is taken into account in the translation model, and local changes in word order from source to target language can be learned explicitly. The model is described using a log-linear modeling approach, which is a generalization of the often used source-channel approach. As a result, the model is easier to extend than classical statistical machine translation systems. We describe in detail the process for learning phrasal translations, the feature functions used, and the search algorithm. The approach is evaluated on three different tasks. For the German-English speech Verbmobil task, we analyze the effect of various system components. On the French-English Canadian Hansards task, the alignment template system obtains significantly better results than a single-word-based translation model. In the Chinese-English 2002 National Institute of Standards and Technology (NIST) machine translation evaluation, it yields statistically significantly better NIST scores than all competing research and commercial translation systems.
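
The statement that the log-linear model generalizes the source-channel approach can be illustrated with a small sketch. It is not the system's implementation: the feature names ("lm", "tm"), the helper functions, and the toy probabilities are all assumptions chosen for illustration. A log-linear model scores a candidate translation e of a source sentence f by a weighted sum of feature functions, sum_m lambda_m * h_m(e, f). If the only two features are log Pr(e) and log Pr(f|e) and both weights are 1, this reduces to the source-channel decision rule.

```python
import math

def log_linear_score(features, weights):
    """Weighted sum of feature values: sum_m lambda_m * h_m(e, f)."""
    return sum(weights[name] * value for name, value in features.items())

def best_translation(candidates, weights):
    """Return the candidate with the highest log-linear score."""
    return max(candidates, key=lambda c: log_linear_score(c["features"], weights))

# Toy candidates with made-up probabilities (illustrative only).
# "lm" stands for log Pr(e), "tm" for log Pr(f | e).
candidates = [
    {"text": "the house is small",
     "features": {"lm": math.log(0.02), "tm": math.log(0.30)}},
    {"text": "small is the house",
     "features": {"lm": math.log(0.001), "tm": math.log(0.40)}},
]

# Source-channel special case: both weights fixed to 1, so the score
# equals log(Pr(e) * Pr(f | e)).
source_channel = {"lm": 1.0, "tm": 1.0}

# A different weighting that ignores the language model entirely.
tm_only = {"lm": 0.0, "tm": 1.0}

choice_sc = best_translation(candidates, source_channel)["text"]
choice_tm = best_translation(candidates, tm_only)["text"]
```

Under these assumed numbers, the source-channel weighting prefers the fluent word order, while dropping the language-model weight lets the translation-model feature alone decide. Adding a further feature only requires another entry in each feature dictionary and a corresponding weight, which is the sense in which such a model is easy to extend.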
