8.4 Machine-aided Human Translation

Existing products are currently those of Trados (MultiTerm), IBM (Translation Manager), and SITE-EuroLang (EuroLang Optimizer). They are available on PC/Windows, PS/OS2, or Unix-based workstations. The intended users are competent translators working in teams and linked through a local network.

Each translator's workstation offers tools to access a bilingual terminology, to access a translation memory, and to submit parts of the text to an MT server. These tools have to be completely integrated in the text processor. The software automatically analyzes the source text, and attaches keyboard shortcuts to the terms and sentences found in the terminological data base and in the translation memory. One very important design decision is whether to offer a specific text processor, as in IBM's Translation Manager, or whether to use directly one or more text processors produced by third parties, as in EuroLang Optimizer.

The server supports tools to manage the common multilingual lexical data base (MLDB), often a multilingual terminological data base (MTDB), and the common translation memory, where previous translations are recorded. Here, concurrent access and strict validation procedures are crucial. The server may also support tools to manage the translation tasks, although this is not always offered.

Let us take the case of the most recent product, EuroLang Optimizer. One instance is available on Sun workstations under Unix. The server uses a standard data base management system (DBMS), Oracle or Sybase, to support the terminological data base and the translation memory. The translators' workstations use Interleaf or FrameMaker as text processors, while their data base functions are reduced versions of those of the server, and are implemented directly in C++. In the other instance, the server runs on a PC under Windows NT, again with Oracle or Sybase, while the translators' workstations use Word 6 on PCs under Windows 3. Source languages currently include English, French, German, Italian and Spanish.
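The pre-analysis step performed on the workstation, in which terms and sentences found in the terminological data base and in the translation memory are tied to keyboard shortcuts, can be illustrated with a minimal sketch. It is not taken from any of the products named above: the names Proposal, split_sentences, find_terms and annotate, the "Alt+n" shortcut labels, and the use of exact string matching are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass
class Proposal:
    shortcut: str  # hypothetical key label such as "Alt+1"
    kind: str      # "sentence" (translation memory hit) or "term"
    source: str
    target: str


def split_sentences(text):
    """Naive splitter: break on whitespace that follows '.', '!' or '?'."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def find_terms(sentence, term_base):
    """Return (source, target) term pairs in order of first occurrence."""
    found = []
    for term, translation in term_base.items():
        pattern = r"\b" + re.escape(term) + r"\b"
        match = re.search(pattern, sentence, re.IGNORECASE)
        if match:
            found.append((match.start(), term, translation))
    return [(term, translation) for _, term, translation in sorted(found)]


def annotate(text, term_base, memory):
    """Pair each source sentence with the proposals reachable by shortcut."""
    annotated = []
    for sentence in split_sentences(text):
        hits = []
        if sentence in memory:  # exact full-sentence match only (assumption)
            hits.append(("sentence", sentence, memory[sentence]))
        hits += [("term", s, t) for s, t in find_terms(sentence, term_base)]
        # Shortcuts are numbered afresh for every sentence.
        proposals = [
            Proposal(f"Alt+{i}", kind, src, tgt)
            for i, (kind, src, tgt) in enumerate(hits, start=1)
        ]
        annotated.append((sentence, proposals))
    return annotated


# Illustrative data only; a real kit would come from the server.
term_base = {"translation memory": "mémoire de traduction", "server": "serveur"}
memory = {
    "The document is sent back to the server.":
        "Le document est renvoyé au serveur.",
}
text = ("The document is sent back to the server. "
        "The translation memory is updated on the server.")

for sentence, proposals in annotate(text, term_base, memory):
    print(sentence)
    for p in proposals:
        print(f"  {p.shortcut}: [{p.kind}] {p.source} -> {p.target}")
```

Exact lookup is used here only to keep the sketch short. A working tool would probably need morphological analysis for terms and some form of approximate matching for sentences.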
There are 17 target languages (almost all languages written with the Latin character set).

When a document has to be translated, it is preprocessed on the server, and sent to a translator's workstation with an associated kit, which contains the corresponding subsets of the dictionary and of the translation memory, as well as (optionally) translation proposals coming from a batch MT system. MAHT-related functionalities are accessible through a supplementary menu (in the case of Word 6) and through keyboard shortcuts dynamically associated with terms or full sentences. The translator may enrich the kit's lexicon. When translation is completed, the document is sent back to the server with its updated kit. On the server, the new translation pairs are added to the translation memory, and updates or additions to the dictionary are
