A survey of Data Driven Machine Translation

Machine Translation (MT) refers to the use of computers to translate automatically from one language to another. The differences between source and target languages, together with the inherent ambiguity of the source language itself, make MT a very difficult problem. Traditional approaches to MT have relied on humans supplying linguistic knowledge in the form of rules for transforming text. Given the vastness of language, this is a highly knowledge-intensive task. Corpus-based approaches dominate MT research today, with Example-Based MT (EBMT) and Statistical MT (SMT) representing two different frameworks within the data-driven paradigm. EBMT is a radically different approach that matches the input against examples from large amounts of training data and then adapts and recombines the matched examples. This survey provides an overview of MT techniques and covers some of the related work on example-based and statistical approaches to machine translation from 1984 to 2011. The report concludes with a brief discussion of example-based hybrid techniques, existing MT systems, and MT evaluation criteria.
