Speaker-independent Mandarin plosive recognition with dynamic features and multilayer perceptrons

A new method for recognising plosives in isolated Mandarin syllables is discussed in the Letter. The plosive segment of the input utterance is first detected automatically, and dynamic features are then extracted from its spectral parameter contours using orthonormal polynomial transforms. A multilayer perceptron (MLP), trained with an algorithm based on a minimum error criterion, uses these features to distinguish the plosives. A promising recognition rate of 73.6% is achieved in a speaker-independent test on a database of 110 syllables uttered by 100 speakers.
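As an illustration of the feature-extraction step, the following minimal sketch projects one spectral parameter contour onto a discrete orthonormal polynomial basis. The abstract does not state the polynomial order, the basis construction, or the spectral parameters used, so everything here is an assumption: the basis is built by Gram-Schmidt orthonormalisation of monomials over the frame indices, the order defaults to 2, and the names `orthonormal_poly_basis` and `contour_features` are hypothetical.

```python
import math


def orthonormal_poly_basis(n_frames, order):
    """Return order+1 discrete polynomials, orthonormal over n_frames points.

    Assumed construction (not taken from the Letter): Gram-Schmidt applied
    to the monomials 1, t, t^2, ... sampled on a normalised time axis.
    """
    if n_frames <= order:
        raise ValueError("need more frames than the polynomial order")
    # Normalised time axis in [0, 1]; max() guards the single-frame case.
    t = [i / max(n_frames - 1, 1) for i in range(n_frames)]
    basis = []
    for degree in range(order + 1):
        v = [x ** degree for x in t]
        # Remove the components along the lower-degree basis vectors.
        for b in basis:
            proj = sum(vi * bi for vi, bi in zip(v, b))
            v = [vi - proj * bi for vi, bi in zip(v, b)]
        norm = math.sqrt(sum(vi * vi for vi in v))
        basis.append([vi / norm for vi in v])
    return basis


def contour_features(contour, order=2):
    """Project one parameter contour onto the basis.

    The coefficients roughly describe the level (degree 0), slope
    (degree 1) and curvature (degree 2) of the contour over the segment.
    """
    basis = orthonormal_poly_basis(len(contour), order)
    return [sum(c * b for c, b in zip(contour, row)) for row in basis]


if __name__ == "__main__":
    # Hypothetical contour: one spectral parameter over five frames.
    print(contour_features([0.0, 1.0, 2.0, 3.0, 4.0]))
```

In a setup like the one described, coefficients of this kind would be computed for each spectral parameter over the detected plosive segment and concatenated into a fixed-length input vector for the MLP, whatever the segment's duration.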
