ROMANIAN CORPUS FOR SPEECH-TO-TEXT ALIGNMENT ANCA –
In this paper we present the methodology used to create an aligned speech-to-text Romanian corpus. The corpus draws on recordings from the AMPERROM and AMPRom projects as well as ad hoc recordings of continuous speech. We describe the protocol for speech recording and labelling and the manual annotation procedure. The corpus is intended for training a speech segmentation module and an automatic speech-to-text aligner module.