NIST Open Machine Translation 2008 Evaluation: Stanford University's System Description

This document describes Stanford University’s first entry into a NIST MT evaluation. Our entry to the 2008 evaluation mainly focused on establishing a competent baseline with a phrase-based system similar to (Och and Ney, 2004; Koehn et al., 2007). In a three-week effort prior to the evaluation, our attention focused on scaling up our system to exploit nearly all Chinese-English parallel data permissible under the constrained track, incorporating competitive language models into the decoder using Gigaword and Google n-grams, evaluating Chinese word segmentation models, and incorporating a document classifier as a pre-processing stage to the decoder.

[1]  Hermann Ney,et al.  HMM-Based Word Alignment in Statistical Translation , 1996, COLING.

[2]  Lalit R. Bahl,et al.  A Maximum Likelihood Approach to Continuous Speech Recognition , 1983, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[3]  D. Rubin,et al.  Maximum likelihood from incomplete data via the EM - algorithm plus discussions on the paper , 1977 .

[4]  Lucian Vlad Lita,et al.  tRuEcasIng , 2003, ACL.

[5]  Andrew McCallum,et al.  Chinese Segmentation and New Word Detection using Conditional Random Fields , 2004, COLING.

[6]  Wolfgang Macherey,et al.  Lattice-based Minimum Error Rate Training for Statistical Machine Translation , 2008, EMNLP.

[7]  Pat Langley,et al.  Editorial: On Machine Learning , 1986, Machine Learning.

[8]  Franz Josef Och,et al.  Minimum Error Rate Training in Statistical Machine Translation , 2003, ACL.

[9]  Hermann Ney,et al.  A Systematic Comparison of Various Statistical Alignment Models , 2003, CL.

[10]  Daniel Marcu,et al.  Statistical Phrase-Based Translation , 2003, NAACL.

[11]  Andreas Stolcke,et al.  SRILM - an extensible language modeling toolkit , 2002, INTERSPEECH.

[12]  NeyHermann,et al.  A systematic comparison of various statistical alignment models , 2003 .

[13]  Christopher D. Manning,et al.  Incorporating Non-local Information into Information Extraction Systems by Gibbs Sampling , 2005, ACL.

[14]  Franz Josef Och,et al.  Statistical machine translation: from single word models to alignment templates , 2002 .

[15]  Philipp Koehn,et al.  Moses: Open Source Toolkit for Statistical Machine Translation , 2007, ACL.

[16]  Andrew McCallum,et al.  Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data , 2001, ICML.

[17]  Robert L. Mercer,et al.  The Mathematics of Statistical Machine Translation: Parameter Estimation , 1993, CL.

[18]  F ChenStanley,et al.  An Empirical Study of Smoothing Techniques for Language Modeling , 1996, ACL.

[19]  Hermann Ney,et al.  The Alignment Template Approach to Statistical Machine Translation , 2004, CL.

[20]  Chris Callison-Burch,et al.  Open Source Toolkit for Statistical Machine Translation: Factored Translation Models and Lattice Decoding , 2006 .

[21]  Mirella Lapata,et al.  Proceedings of the 37th Annual Meeting of the Association for Computational Linguistics , 1999, ACL 1999.