Paper information - Neural network boundary refining for automatic speech segmentation

This work extends previous work that proposed an automatic speech segmentation and labeling system based on a hidden Markov model (HMM) speech recognizer followed by a fuzzy-logic boundary correction system. In this paper we explore replacing that difficult-to-design fuzzy-logic system with a neural network (NN) based system that can be trained automatically. First, the whole fuzzy-logic boundary correction system, which used a different rule set for each kind of phonetic transition, is replaced by a single NN. Results show that this single NN outperforms the complete fuzzy-logic system. Then, we explore using different NNs, each specialized in one kind of phonetic transition. Results are again clearly better than those obtained with the fuzzy-logic system, but not clearly better than those obtained with a single NN.
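The two configurations described in the abstract (one NN for all boundaries versus one NN per kind of phonetic transition) can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the idea that the network outputs a time offset in seconds, the feature vectors, the transition labels, the one-hidden-layer tanh topology, and the names `make_tiny_mlp` and `refine_boundaries`. The weights are fixed by hand rather than trained.

```python
import math
from typing import Callable, Dict, List, Sequence

# Hypothetical model type: maps features around a boundary to a time offset (seconds).
OffsetModel = Callable[[Sequence[float]], float]


def make_tiny_mlp(w_hidden: List[List[float]], b_hidden: List[float],
                  w_out: List[float], b_out: float) -> OffsetModel:
    """Build a one-hidden-layer tanh network with fixed, illustrative weights."""
    def predict(x: Sequence[float]) -> float:
        hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
                  for row, b in zip(w_hidden, b_hidden)]
        return sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return predict


def refine_boundaries(boundaries: Sequence[float],
                      features: Sequence[Sequence[float]],
                      transitions: Sequence[str],
                      models: Dict[str, OffsetModel],
                      default_model: OffsetModel) -> List[float]:
    """Shift each HMM boundary by the offset predicted for it.

    If `models` has an entry for the boundary's transition kind, that
    specialized network is used; otherwise the single shared network is.
    """
    refined = []
    for t, x, kind in zip(boundaries, features, transitions):
        model = models.get(kind, default_model)
        refined.append(t + model(x))
    return refined


# Illustrative data only: three HMM boundaries, two features per boundary.
hmm_boundaries = [0.10, 0.25, 0.40]
feats = [[0.2, -0.1], [0.0, 0.3], [-0.4, 0.1]]
kinds = ["vowel-stop", "stop-vowel", "vowel-nasal"]  # hypothetical labels

shared_nn = make_tiny_mlp([[0.5, -0.5], [0.1, 0.2]], [0.0, 0.0], [0.01, -0.01], 0.0)
stop_vowel_nn = make_tiny_mlp([[0.3, 0.3]], [0.0], [0.02], 0.0)

single = refine_boundaries(hmm_boundaries, feats, kinds, {}, shared_nn)
specialized = refine_boundaries(hmm_boundaries, feats, kinds,
                                {"stop-vowel": stop_vowel_nn}, shared_nn)
```

Passing an empty `models` dictionary corresponds to the single-NN setup; adding entries keyed by transition kind corresponds to the specialized setup, with the shared network as a fallback for kinds that have no dedicated model.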

Doroteo Torre Toledano
