Stanford University's Submissions to the WMT 2014 Translation Task

We describe Stanford’s participation in the French-English and English-German tracks of the 2014 Workshop on Statistical Machine Translation (WMT). Our systems used large feature sets, word classes, and an optional unconstrained language model. Among constrained systems, ours performed the best according to uncased BLEU: 36.0% for French-English and 20.9% for English-German.

[1]  Kenneth Heafield,et al.  KenLM: Faster and Smaller Language Model Queries , 2011, WMT@EMNLP.

[2]  Beatrice Santorini,et al.  Building a Large Annotated Corpus of English: The Penn Treebank , 1993, CL.

[3]  Preslav Nakov,et al.  Optimizing for Sentence-Level BLEU+1 Yields Short Translations , 2012, COLING.

[4]  Christopher D. Manning,et al.  Feature-Rich Phrase-based Translation: Stanford University’s Submission to the WMT 2013 Translation Task , 2013, WMT@ACL.

[5]  Philipp Koehn,et al.  Dirt Cheap Web-Scale Parallel Text from the Common Crawl , 2013, ACL.

[6]  Philip C. Woodland,et al.  Efficient class-based language modelling for very large vocabularies , 2001, 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing. Proceedings (Cat. No.01CH37221).

[7]  Shankar Kumar,et al.  Minimum Bayes-Risk Decoding for Statistical Machine Translation , 2004, NAACL.

[8]  Hermann Ney,et al.  The Alignment Template Approach to Statistical Machine Translation , 2004, CL.

[9]  Thorsten Brants,et al.  Distributed Word Clustering for Large Scale Class-Based Language Modeling in Machine Translation , 2008, ACL.

[10]  Christopher D. Manning,et al.  Fast and Adaptive Online Training of Feature-Rich Translation Models , 2013, ACL.

[11]  Christopher D. Manning,et al.  An Empirical Comparison of Features and Tuning for Phrase-based Machine Translation , 2014, WMT@ACL.

[12]  Christopher D. Manning,et al.  Generating Typed Dependency Parses from Phrase Structure Parses , 2006, LREC.

[13]  Kenneth Heafield,et al.  N-gram Counts and Language Models from the Common Crawl , 2014, LREC.

[14]  Lucian Vlad Lita,et al.  tRuEcasIng , 2003, ACL.

[15]  Christopher D. Manning,et al.  Parsing Models for Identifying Multiword Expressions , 2013, CL.

[16]  Hermann Ney,et al.  Improved backing-off for M-gram language modeling , 1995, 1995 International Conference on Acoustics, Speech, and Signal Processing.

[17]  Christopher D. Manning,et al.  A Simple and Effective Hierarchical Phrase Reordering Model , 2008, EMNLP.

[18]  Andreas Stolcke,et al.  SRILM - an extensible language modeling toolkit , 2002, INTERSPEECH.

[19]  Philipp Koehn,et al.  Europarl: A Parallel Corpus for Statistical Machine Translation , 2005, MTSUMMIT.

[20]  Christopher D. Manning,et al.  Phrasal: A Toolkit for New Directions in Statistical Machine Translation , 2014, WMT@ACL.

[21]  Philipp Koehn,et al.  Scalable Modified Kneser-Ney Language Model Estimation , 2013, ACL.

[22]  F ChenStanley,et al.  An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.

[23]  Ben Taskar,et al.  Alignment by Agreement , 2006, NAACL.

[24]  Philipp Koehn,et al.  Moses: Open Source Toolkit for Statistical Machine Translation , 2007, ACL.

[25]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[26]  Slav Petrov,et al.  Source-Side Classifier Preordering for Machine Translation , 2013, EMNLP.

[27]  Deniz Yuret,et al.  Instance Selection for Machine Translation using Feature Decay Algorithms , 2011, WMT@EMNLP.

[28]  Paolo Ferragina,et al.  Text Compression , 2009, Encyclopedia of Database Systems.