Hierarchical phoneme discrimination by hidden Markov modelling using cepstrum and formant information

Comparative results are reported for vowel classification using hidden Markov models (HMMs) with two kinds of input: linear predictive coding (LPC)-based cepstral vectors and formant features. Classification accuracy is shown to improve significantly when time duration constraints are applied in the formant feature space, especially for the formant mel-frequency representation and its time derivative. The highest vowel recognition accuracy is obtained by integrating the two feature spaces, that is, by multiplying the probabilities computed separately in each feature space. This improvement in vowel recognition is extended to the more general phoneme recognition task by a hierarchical feature integration method, which combines the vowel recognition results in the formant feature space with consonant recognition based on the LPC-based cepstral feature space.
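The integration step described above, multiplying the probabilities computed in the two feature spaces, can be illustrated with a minimal sketch. It assumes that each feature space has already produced a log-likelihood per vowel class from its own HMMs; the function names, the vowel labels, and the numeric scores are hypothetical and are not taken from the paper. Summing log-likelihoods is used here because it is equivalent to multiplying probabilities while avoiding numerical underflow.

```python
import math

def combine_scores(cepstral_loglik, formant_loglik):
    """Sum per-class log-likelihoods from the two feature spaces.

    Adding logs is equivalent to multiplying the underlying probabilities.
    Only classes scored in both feature spaces are kept.
    """
    return {
        label: cepstral_loglik[label] + formant_loglik[label]
        for label in cepstral_loglik
        if label in formant_loglik
    }

def classify(cepstral_loglik, formant_loglik):
    """Return the class with the highest combined score."""
    combined = combine_scores(cepstral_loglik, formant_loglik)
    return max(combined, key=combined.get)

# Hypothetical per-class scores (illustrative values only, not paper results).
cepstral = {"iy": math.log(0.5), "ih": math.log(0.4), "eh": math.log(0.1)}
formant = {"iy": math.log(0.3), "ih": math.log(0.6), "eh": math.log(0.1)}

best = classify(cepstral, formant)
```

In this toy example the cepstral scores alone would favour "iy", while the product of the two scores favours "ih", which shows how the second feature space can change the decision.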
