JANUS is a multi-lingual speech-to-speech translation system designed to facilitate communication between two parties engaged in a spontaneous conversation in a limited domain. In this paper we describe our methodology for evaluating translation performance. Our current focus is on end-to-end evaluations: the evaluation of the translation capabilities of the system as a whole. The main goal of our end-to-end evaluation procedure is to determine translation accuracy on a test set of previously unseen dialogues. Other goals include evaluating the effectiveness of the system in conveying domain-relevant information and in detecting and dealing appropriately with utterances (or portions of utterances) that are out-of-domain. End-to-end evaluations are performed in order to verify the general coverage of our knowledge sources, guide our development efforts, and track our improvement over time. We discuss our evaluation procedures, the criteria used for assigning scores to translations produced by the system, and the tools developed for performing this task. Recent Spanish-to-English performance evaluation results are presented as an example.