Decoding in Joshua: Open Source, Parsing-Based Machine Translation

We describe a scalable decoder for parsing-based machine translation. The decoder is written in Java and implements all the essential algorithms described in Chiang (2007) and Li and Khudanpur (2008b): chart parsing, n-gram language model integration, beam and cube pruning, and k-best extraction. Additionally, parallel and distributed computing techniques are exploited to make it scalable. We demonstrate experimentally that our decoder is more than 30 times faster than a baseline decoder written in Python.

Sanjeev Khudanpur | Chris Callison-Burch | Zhifei Li | Wren N. G. Thornton

[1] David Chiang, et al. Forest Rescoring: Faster Decoding with Integrated Language Models, 2007, ACL.

[2] Hermann Ney, et al. A Systematic Comparison of Various Statistical Alignment Models, 2003, CL.

[3] Wolfgang Macherey, et al. Lattice-based Minimum Error Rate Training for Statistical Machine Translation, 2008, EMNLP.

[4] Yang Liu, et al. Tree-to-String Alignment Template for Statistical Machine Translation, 2006, ACL.

[5] Miles Osborne, et al. Randomised Language Modelling for Statistical Machine Translation, 2007, ACL.

[6] Adam Lopez, et al. Hierarchical Phrase-Based Translation with Suffix Arrays, 2007, EMNLP.

[7] Daniel Marcu, et al. Scalable Inference and Training of Context-Rich Syntactic Translation Models, 2006, ACL.

[8] Sanjeev Khudanpur, et al. A Scalable Decoder for Parsing-Based Machine Translation with Equivalent Language Model State Maintenance, 2008, SSST@ACL.

[9] Haizhou Li, et al. I2r multi-pass machine translation system for IWSLT 2008, 2008, IWSLT.

[10] S. Khudanpur, et al. Large-scale Discriminative n-gram Language Models for Statistical Machine Translation, 2008, AMTA.

[11] Omar Zaidan, et al. Z-MERT: A Fully Configurable Open Source Tool for Minimum Error Rate Training of Machine Translation Systems, 2009, Prague Bull. Math. Linguistics.

[12] Barry Haddow, et al. Improved Minimum Error Rate Training in Moses, 2009, Prague Bull. Math. Linguistics.

[13] M. F., et al. Bibliography, 1985, Experimental Gerontology.

[14] Philipp Koehn, et al. Moses: Open Source Toolkit for Statistical Machine Translation, 2007, ACL.

[15] Liang Huang, et al. Statistical Syntax-Directed Translation with Extended Domain of Locality, 2006, AMTA.

[16] David Chiang, et al. An Introduction to Synchronous Grammars, 2006.

[17] David Chiang, et al. Hierarchical Phrase-Based Translation, 2007, CL.

[18] David Chiang, et al. Better k-best Parsing, 2005, IWPT.

[19] Stephan Vogel, et al. An Efficient Two-Pass Approach to Synchronous-CFG Driven Statistical MT, 2007, NAACL.

[20] Philip Resnik, et al. Soft Syntactic Constraints for Hierarchical Phrased-Based Translation, 2008, ACL.

[21] Chris Callison-Burch, et al. Scaling Phrase-Based Statistical Machine Translation to Larger Corpora and Longer Phrases, 2005, ACL.

[22] Franz Josef Och, et al. Minimum Error Rate Training in Statistical Machine Translation, 2003, ACL.

[23] Jason Eisner, et al. Learning Non-Isomorphic Tree Mappings for Machine Translation, 2003, ACL.

[24] Chris Quirk, et al. Dependency Treelet Translation: Syntactically Informed Phrasal SMT, 2005, ACL.

[25] Andreas Stolcke, et al. SRILM - an extensible language modeling toolkit, 2002, INTERSPEECH.

[26] Salim Roukos, et al. Bleu: a Method for Automatic Evaluation of Machine Translation, 2002, ACL.