Machine Translation of Closed Captions

Traditional Machine Translation (MT) systems are designed to translate documents. In this paper we describe an MT system that translates the closed captions that accompany most North American television broadcasts. This domain has two distinguishing characteristics. First, the captions differ markedly from the textual input that many MT systems were designed for: captions generally represent speech and therefore contain many of the phenomena that characterize spoken language. Second, the operational characteristics of the closed-caption domain are distinctive. Unlike in most other translation domains, the translated captions are only one of several sources of information available to the user, and the user has limited time to comprehend the translation because captions appear on the screen for only a few seconds. We examine some of the theoretical and implementation challenges that these characteristics pose for MT, and we present ALTo, a fully automatic, large-scale multilingual MT system. Our approach is based on Whitelock's Shake and Bake MT paradigm, which relies heavily on lexical resources. The system currently provides wide-coverage translation from English to Spanish. In addition to discussing the design of the system, we address the evaluation issues associated with this domain and report on our current performance.
