SpeechDat(E) - Eastern European Telephone Speech Databases

This paper describes the creation of five new telephone speech databases for Central and Eastern European languages within the SpeechDat(E) project. The five languages concerned are Czech, Polish, Slovak, Hungarian, and Russian. The databases follow the SpeechDat-II specifications with some language-specific adaptations. The paper describes the differences between SpeechDat(E) and earlier SpeechDat projects with regard to database items such as the generation of phonetically rich sentences, speaker recruitment, etc. Collection of the databases is in its final phase. The databases will be validated by SPEX and distributed by ELRA.