Paper information - ROMANIAN CORPUS FOR SPEECH-TO-TEXT ALIGNMENT

ROMANIAN CORPUS FOR SPEECH-TO-TEXT ALIGNMENT

In this paper we present the methodology used to create an aligned speech-to-text Romanian corpus. The corpus draws on recordings from the AMPERROM and AMPRom projects as well as ad hoc recordings of continuous speech. We describe the protocol for speech recording and labelling and the manual annotation procedure. The corpus is intended for training a speech segmentation module and an automatic speech-to-text aligner module.

ANCA-DIANA BIBIRI | DAN CRISTEA | LAURA PISTOL | LIVIU-ANDREI SCUTELNICU | ADRIAN TURCULEȚ

[1] John A. Bullinaria. Text to phoneme alignment and mapping for speech technology: A neural networks approach, 2011, The 2011 International Joint Conference on Neural Networks.

[2] J.-D. S. Marsters, et al. Aligning Text and Phonemes for Speech Technology Applications Using an EM-Like Algorithm, 1997.

[3] John-Paul Hosom, et al. Speaker-independent phoneme alignment using transition-dependent states, 2009, Speech Commun.