Hybrid Architectures for Machine Translation Systems

Although the quality of machine translation (MT) has improved in recent years, there is still significant room for improvement. There has also been a paradigm shift in machine translation, from “classical” rule-based systems like METAL or LMT towards example-based or statistical MT. It now seems time to evaluate the progress, compare the results of these efforts, and draw conclusions for further improvements of MT quality. The paper starts with a comparison between statistical MT (henceforth: SMT) and rule-based MT (henceforth: RMT) systems, describing the set-up and the evaluation results; the second section analyses the strengths and weaknesses of the respective approaches, and the third discusses models of an architecture for a hybrid system.
