TC-STAR: New language resources for ASR and SLT purposes

In TC-STAR, a variety of Language Resources (LR) is being produced. In this contribution we address the resources that have been created for Automatic Speech Recognition (ASR) and Spoken Language Translation (SLT). To date, these comprise 14 LR in total: two training LR for ASR (English and Spanish), three development LR and three evaluation LR for ASR (English, Spanish, Mandarin), and three development LR and three evaluation LR for SLT (English-Spanish, Spanish-English, Mandarin-English). In this paper we describe the properties, validation, and availability of these resources.
