A Romanian corpus for speech perception and automatic speech recognition

A Romanian speech corpus is available as common material for research in speech perception and automatic speech recognition. It consists of high-quality audio recordings of 400 sentences spoken by each of 12 speakers. The utterances are simple, syntactically identical phrases such as "muta bronz cu p 2 agale." Preliminary intelligibility tests using the audio signals suggest that the collected speech is easily identifiable in quiet and at low noise levels. The corpus is annotated at the phoneme, syllable, and word levels and is available on the website for research use.
