Statistical Machine Translation

This introductory text to statistical machine translation (SMT) provides all of the theories and methods needed to build a statistical machine translator, such as Google Language Tools and Babelfish. In general, statistical techniques allow automatic translation systems to be built quickly for any language pair using only translated texts and generic software. With increasing globalization, statistical machine translation will be central to communication and commerce. Based on courses and tutorials, and classroom-tested globally, the book is ideal for instruction or self-study, for advanced undergraduates and graduate students in computer science or computational linguistics, and for researchers in natural language processing. The companion website provides open-source corpora and toolkits.
