Comparative study on corpora for speech translation

This paper investigates issues in preparing corpora for developing speech-to-speech translation (S2ST) systems. It is impractical to create a broad-coverage parallel corpus from dialog speech alone. An alternative approach is to have bilingual experts write conversational-style texts in the target domain, together with their translations. There is, however, a risk of losing fidelity to actual utterances. This paper focuses on balancing the tradeoff between these two kinds of corpora through the analysis of two newly developed corpora in the travel domain: a bilingual parallel corpus of 420K utterances and a collection of in-domain dialogs gathered using actual S2ST systems. We found that the first corpus is effective for covering utterances in the second corpus if complemented with a small number of utterances taken from monolingual dialogs. We also found that the characteristics of in-domain utterances become closer to those of the first corpus when more restrictive conditions and instructions are given to speakers. These results suggest the possibility of bootstrap-style development of corpora and S2ST systems, in which an initial S2ST system is developed with parallel texts and is then gradually improved with in-domain utterances collected by the system as restrictions are relaxed.
