Phrase extraction and rescoring in statistical machine translation

The lack of linguistically motivated translation units, or phrase pairs, in Phrase-based Statistical Machine Translation (PB-SMT) systems is a well-known source of error. One approach to minimising such errors is to supplement the standard PB-SMT models with phrase pairs extracted from parallel treebanks (linguistically annotated and aligned corpora). In this thesis, we extend the treebank-based phrase extraction framework with percolated dependencies, a hitherto unutilised knowledge source, and evaluate its usability through more than a dozen syntax-aware phrase extraction models. However, despite the proven advantages of linguistically motivated phrase pairs, the improvement in system performance is neither consistent nor conclusive. This leads us to hypothesise that the PB-SMT pipeline is flawed, in that it often fails to access perfectly good phrase pairs while searching for the highest-scoring translation (decoding). A model error occurs when the highest-probability translation according to a statistical machine translation model (the actual output of a PB-SMT system) is not the most accurate translation the system can produce. In the second part of this thesis, we identify and attempt to trace these model errors across state-of-the-art PB-SMT decoders by locating the position of oracle translations (the candidate translations most similar to a reference translation, i.e. the expected output of a PB-SMT system) in the n-best lists generated by a PB-SMT decoder. We analyse the impact of individual decoding features on the quality of translation output and introduce two rescoring algorithms that aim to move low-ranked oracles higher in the n-best lists. Finally, we extend our oracle-based rescoring approach to a reranking framework by rescoring the n-best lists with additional reranking features. We observe limited but encouraging success and conclude by speculating on how our oracle-based rescoring of n-best lists can help a PB-SMT system (supplemented with multiple treebank-based phrase extractions) get the best performance out of linguistically motivated phrase pairs.
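
To make the notion of locating an oracle in an n-best list concrete, the following is a minimal sketch and not the thesis's actual procedure. It assumes an n-best list given as (hypothesis, model score) pairs and uses an averaged unigram/bigram F1 overlap as a stand-in similarity measure; the function names `overlap_f1` and `oracle_rank`, the choice of metric, and the toy sentences are all illustrative assumptions.

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def overlap_f1(hypothesis, reference, max_n=2):
    """Average n-gram F1 for n = 1..max_n (an assumed stand-in metric)."""
    hyp, ref = hypothesis.split(), reference.split()
    scores = []
    for n in range(1, max_n + 1):
        h, r = ngram_counts(hyp, n), ngram_counts(ref, n)
        matched = sum((h & r).values())  # clipped n-gram matches
        if matched == 0:
            scores.append(0.0)
            continue
        precision = matched / sum(h.values())
        recall = matched / sum(r.values())
        scores.append(2 * precision * recall / (precision + recall))
    return sum(scores) / len(scores)


def oracle_rank(nbest, reference):
    """Return (1-based rank, hypothesis) of the oracle in model-score order.

    nbest is a list of (hypothesis, model_score) pairs; a higher score is
    better. Ties in similarity go to the higher-ranked hypothesis.
    """
    ranked = sorted(nbest, key=lambda pair: pair[1], reverse=True)
    sims = [overlap_f1(hyp, reference) for hyp, _ in ranked]
    best = max(range(len(ranked)), key=lambda i: sims[i])
    return best + 1, ranked[best][0]


# Toy n-best list (hypothetical data): the decoder prefers the first entry.
nbest = [
    ("the cat sat on mat", -1.0),
    ("the cat sat on the mat", -1.5),
    ("a cat is sitting", -2.0),
]
rank, oracle = oracle_rank(nbest, "the cat sat on the mat")
print(rank, oracle)  # prints: 2 the cat sat on the mat
```

In this toy case the oracle sits at rank 2 rather than rank 1, which is the situation described above as a model error: the model's top-scoring hypothesis is not the best translation available in the list.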
