Evolution of MT with the Web

Since the Cranfield-94 conference, we have come to a better understanding of the nature of MT systems by separately analyzing their linguistic, computational, and operational architectures. Also, thanks to the CxAxQ metatheorem, the systems’ inherent limits have been clarified, and design choices can now be made in an informed manner according to the translation situations. MT evaluation has also matured: tools based on reference translations are useful for measuring progress; those based on subjective judgments for estimating future usage quality; and task-related objective measures (such as post-editing distance) for measuring operational quality. Moreover, the same technological advances that have led to “Web 2.0” have brought several futuristic predictions to fruition. Free Web MT services have democratized assimilation MT beyond belief. Speech translation research has given rise to usable systems for restricted tasks running on PDAs or on mobile phones connected to servers. New man-machine interface techniques have made interactive disambiguation usable in large-coverage multimodal MT. Increases in computing power have made statistical methods workable, and have led to the possibility of building low-linguistic-quality but still useful MT systems by machine learning from aligned bilingual corpora (SMT, EBMT). In parallel, progress has been made in developing interlingua-based MT systems, using hybrid methods. Unfortunately, many misconceptions about MT have spread among the public, and even among MT researchers, because of ignorance of the past and present of MT R&D. A compensating factor is the willingness of end users to freely contribute to building essential parts of the linguistic knowledge needed to construct MT systems, whether corpus-related or lexical.
Finally, some developments we anticipated fifteen years ago have not yet materialized, such as online writing tools equipped with interactive disambiguation, and as a corollary the possibility of transforming source documents into self-explaining documents (SEDs) and of producing corresponding SEDs fully automatically in several target languages. These visions should now be realized, thanks to the evolution of Web programming and multilingual NLP techniques, leading towards a true Semantic Web, “Web 3.0,” which will support “ubilingual” (ubiquitous multilingual) computing.
