Tone recognition of continuous speech of standard Chinese using neural network and tone nucleus model
A method is developed for recognizing the lexical tone types of Standard Chinese syllables in continuous speech. A neural network (a four-layered perceptron) is adopted as the classifier. The method consists of two steps: tone types are first recognized using prosodic features of the voiced part of each syllable, and are then re-recognized using only the tone nucleus, the portion of the syllable whose fundamental frequency (F0) contour is rather stable regardless of the tone types of the preceding and following syllables. The voiced part (or the tone nucleus) is divided into 20 segments, and the F0, delta-F0, F0 slope, and short-term energy of each segment serve as inputs to the neural network. To cope with tone coarticulation, the prosodic feature parameters of the last 5 segments of the preceding syllable and the first 5 segments of the following syllable are also included in the inputs, together with information on syllable length. A tone recognition experiment was conducted on a female speaker's utterances from the HKU96 corpus. The average recognition rate, including neutral-tone syllables, was 86.5% without the tone nucleus model and rose to 86.9% with it. The obtained rate is more than 3 points higher than that of the hidden-Markov-model-based tone recognizer previously developed by the authors.
Index Terms: tone recognition, tone nucleus model, neural network, Standard Chinese
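The input layout described in the abstract can be illustrated with a minimal sketch. The abstract fixes only the counts (20 segments, 4 features per segment, 5 context segments on each side, plus syllable length). Everything else here is an assumption made for illustration and is not taken from the paper: the function names `segment_features` and `build_input`, the resampling to a fixed grid, the definition of delta-F0 as the difference between consecutive segment means, the least-squares slope, and the use of a single frame count as the length feature.

```python
import numpy as np

N_SEG = 20   # segments per voiced part or tone nucleus (from the abstract)
N_CTX = 5    # context segments taken from each neighboring syllable (from the abstract)
PTS = 4      # samples per segment after resampling (assumption, not from the source)


def segment_features(f0, energy, n_seg=N_SEG, pts=PTS):
    """Return an (n_seg, 4) array: mean F0, delta-F0, F0 slope, mean energy.

    f0 and energy are frame-level contours of equal length.
    """
    f0 = np.asarray(f0, dtype=float)
    energy = np.asarray(energy, dtype=float)
    # Resample both contours to a fixed grid so every syllable yields n_seg segments.
    src = np.linspace(0.0, 1.0, len(f0))
    dst = np.linspace(0.0, 1.0, n_seg * pts)
    f0_seg = np.interp(dst, src, f0).reshape(n_seg, pts)
    en_seg = np.interp(dst, src, energy).reshape(n_seg, pts)

    mean_f0 = f0_seg.mean(axis=1)
    # Assumed definition: change in mean F0 from the previous segment (0 for the first).
    delta_f0 = np.diff(mean_f0, prepend=mean_f0[0])
    # Least-squares slope of F0 within each segment (one fit per column).
    slope = np.polyfit(np.arange(pts), f0_seg.T, 1)[0]
    mean_en = en_seg.mean(axis=1)
    return np.column_stack([mean_f0, delta_f0, slope, mean_en])


def build_input(prev_syl, cur_syl, next_syl):
    """Concatenate context and current-syllable features into one input vector.

    Each argument is a (f0, energy) pair of frame-level contours.
    """
    prev_ctx = segment_features(*prev_syl)[-N_CTX:]   # last 5 segments of preceding syllable
    cur = segment_features(*cur_syl)                  # all 20 segments of current syllable
    next_ctx = segment_features(*next_syl)[:N_CTX]    # first 5 segments of following syllable
    length = float(len(cur_syl[0]))                   # assumed length feature: frame count
    return np.concatenate([prev_ctx.ravel(), cur.ravel(), next_ctx.ravel(), [length]])


# Synthetic contours, purely illustrative.
frames = 50
rising = (np.linspace(100.0, 200.0, frames), np.ones(frames))
flat = (np.full(frames, 150.0), np.ones(frames))
x = build_input(flat, rising, flat)
print(x.shape)  # (121,)
```

Under these assumptions the input vector has (5 + 20 + 5) × 4 + 1 = 121 elements. The sketch does not handle syllables at utterance boundaries, where a preceding or following syllable is missing, and it says nothing about how the perceptron's layers are sized, since the abstract does not give those details.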