A Critique of Statistical Machine Translation

Phrase-Based Statistical Machine Translation (PB-SMT) is clearly the leading paradigm in the field today. Nevertheless, and this may come as some surprise to the PB-SMT community, most translators and, perhaps more surprisingly, many experienced MT protagonists find the basic model extremely difficult to understand. The main aim of this paper, therefore, is to discuss why this might be the case. Our basic thesis is that proponents of PB-SMT do not seek to address any community other than their own, for they do not feel any need to do so. We demonstrate that this was not always the case; on the contrary, when statistical models of translation were first presented, the language used to describe how such a model might work was conciliatory and inclusive. Over the next five years, things changed considerably; once SMT achieved dominance, particularly over the rule-based paradigm, it had established a position where it did not need to bring the rest of the MT community along with it, and in our view this has largely remained the case to this day. Having discussed these issues, we address three further ones: the role of automatic MT evaluation metrics when describing PB-SMT systems; the recent syntactic embellishments of PB-SMT, noting especially that most of these contributions have come from researchers who have prior experience in fields other than statistical models of translation; and the relationship between PB-SMT and other models of translation, suggesting that there are many gains to be had if the SMT community were to open up more to the other MT paradigms.
