Word Ordering with Phrase-Based Grammars

We describe an approach to word ordering using modelling techniques from statistical machine translation. The system incorporates a phrase-based model of string generation that aims to take unordered bags of words and produce fluent, grammatical sentences. We describe the generation grammars and introduce parsing procedures that address the computational complexity of generation under permutation of phrases. Against the best previous results reported on this task, obtained using syntax-driven models, we report large quality improvements, with BLEU score gains of 20 points or more, which we confirm with human fluency judgements. Our system incorporates dependency language models, large n-gram language models, and minimum Bayes risk decoding.
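To make the task concrete, the following is a minimal sketch of word ordering as a search problem: given a bag of words, pick the permutation that a language model scores highest. It is not the system described above. It assumes a toy add-alpha smoothed bigram model and exhaustive search over word permutations, and every name in it (`train_bigram_model`, `log_prob`, `order_words`, the three-sentence corpus) is hypothetical. The exhaustive search visits up to n! orderings, which illustrates why generation under permutation needs the kind of grammar and parsing procedures the abstract mentions.

```python
import itertools
import math
from collections import Counter


def train_bigram_model(sentences):
    """Collect context and bigram counts, with <s> and </s> boundary markers."""
    contexts, bigrams, vocab = Counter(), Counter(), set()
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens)
        contexts.update(tokens[:-1])                  # every token that has a successor
        bigrams.update(zip(tokens[:-1], tokens[1:]))  # adjacent token pairs
    return {"contexts": contexts, "bigrams": bigrams, "vocab_size": len(vocab)}


def log_prob(tokens, model, alpha=0.1):
    """Add-alpha smoothed bigram log-probability of a token sequence."""
    padded = ["<s>"] + list(tokens) + ["</s>"]
    total = 0.0
    for prev, cur in zip(padded[:-1], padded[1:]):
        numerator = model["bigrams"][(prev, cur)] + alpha
        denominator = model["contexts"][prev] + alpha * model["vocab_size"]
        total += math.log(numerator / denominator)
    return total


def order_words(bag, model):
    """Return the highest-scoring ordering of the bag (brute force, O(n!))."""
    candidates = set(itertools.permutations(bag))  # set() drops duplicate orderings
    # Ties are broken by the tuple itself so the result is deterministic.
    best = max(candidates, key=lambda perm: (log_prob(perm, model), perm))
    return list(best)


# Toy corpus, assumed purely for illustration.
corpus = ["the dog barks", "the cat sleeps", "a dog sleeps"]
model = train_bigram_model(corpus)
print(" ".join(order_words(["barks", "dog", "the"], model)))
```

Brute force is workable only for very short inputs. Swapping the exhaustive search for a beam search or a chart-based procedure over phrases would change `order_words` but leave the scoring function untouched.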
