A Multi-Perspective Evaluation of the NESPOLE! Speech-to-Speech Translation System

The performance and usability of real-world speech-to-speech translation systems, such as the one developed within the NESPOLE! project, depend on several aspects that go beyond the translation quality of the underlying components. In this paper we describe these aspects as perspectives along which we have evaluated the NESPOLE! system. Four main issues are investigated: (1) system performance under various network traffic conditions; (2) the usage and utility of multi-modality in the context of multi-lingual communication; (3) a comparison of the features of the individual speech recognition engines; and (4) an end-to-end evaluation of the system.
