Neural network boundary refining for automatic speech segmentation
This work extends a previous study that proposed an automatic speech segmentation and labeling system based on a hidden Markov model (HMM) speech recognizer followed by a fuzzy-logic boundary correction system. In this paper we explore replacing that difficult-to-design fuzzy-logic system with a neural network (NN) based system that can be trained automatically. First, the whole fuzzy-logic boundary correction system, which used a different rule set for each kind of phonetic transition, is replaced by a single NN. Results show that this single NN outperforms the complete fuzzy-logic system. Then, we explore using different NNs, each specialized in one kind of phonetic transition. The results are again clearly better than those obtained with the fuzzy-logic system, but not clearly better than those obtained with a single NN.
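The two configurations compared above (one NN for all boundaries versus one NN per kind of phonetic transition) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the abstract does not specify the input features, the network architecture, or the transition classes, so the names `make_tiny_mlp` and `refine_boundaries`, the labels such as "stop-vowel", and the choice of a one-hidden-layer tanh network that predicts a time offset in seconds are all hypothetical.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence

# Hypothetical type: maps a feature vector taken around an HMM boundary to a
# predicted correction in seconds (a positive value moves the boundary later).
OffsetModel = Callable[[Sequence[float]], float]


def make_tiny_mlp(
    w_hidden: Sequence[Sequence[float]],
    b_hidden: Sequence[float],
    w_out: Sequence[float],
    b_out: float,
) -> OffsetModel:
    """Build a one-hidden-layer tanh network with a single linear output.

    The weights are passed in directly; training is outside this sketch.
    """
    def predict(x: Sequence[float]) -> float:
        hidden = [
            math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
            for row, b in zip(w_hidden, b_hidden)
        ]
        return sum(w * h for w, h in zip(w_out, hidden)) + b_out

    return predict


def refine_boundaries(
    boundaries: Sequence[float],
    features: Sequence[Sequence[float]],
    transitions: Sequence[str],
    general: OffsetModel,
    specialized: Optional[Dict[str, OffsetModel]] = None,
) -> List[float]:
    """Shift each HMM boundary by the offset predicted for it.

    With `specialized` left empty, every boundary goes through the single
    general model. Otherwise a boundary uses the model registered for its
    transition label, falling back to the general model (an assumed policy).
    """
    specialized = specialized or {}
    refined = []
    for time, x, kind in zip(boundaries, features, transitions):
        model = specialized.get(kind, general)
        refined.append(time + model(x))
    return refined
```

Calling `refine_boundaries` with only `general` corresponds to the single-NN setup, while passing a `specialized` dictionary corresponds to the per-transition setup. Keeping both behind one function makes it easy to compare them on the same boundaries.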