On the use of hidden Markov modelling for recognition of dysarthric speech.

Recognition of the speech of severely dysarthric individuals requires a technique that is robust to two unusual conditions: very high variability and very little training data. A hidden Markov model approach to isolated word recognition is used in an attempt to automatically model the enormous variability of the speech, while signal preprocessing measures and model modifications are employed to make better use of the existing data. Two findings are contrary to general experience with the recognition of normal speech. The first is that an ergodic model is found to outperform a standard left-to-right (Bakis) model structure. The second is that automated clipping of transitional acoustics in the speech is found to significantly enhance recognition. Experimental results are presented using utterances of persons with cerebral palsy who have a range of articulatory abilities.
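The structural difference between the two model types compared above can be illustrated with their state-transition matrices. The following is a minimal sketch, not the models used in the paper: the function names, the four-state size, the uniform starting probabilities, and the `max_jump` limit are all assumptions chosen for illustration. In an ergodic model every state can reach every other state, while a left-to-right (Bakis) model forbids backward transitions, so its matrix is upper triangular.

```python
def ergodic_transitions(n_states):
    """Uniform ergodic matrix: every state can reach every state.

    Hypothetical initialization; real models are trained from data.
    """
    p = 1.0 / n_states
    return [[p for _ in range(n_states)] for _ in range(n_states)]


def bakis_transitions(n_states, max_jump=2):
    """Uniform left-to-right (Bakis) matrix.

    State i may stay where it is or move forward by at most max_jump
    states (an assumed limit); all backward transitions are zero.
    """
    rows = []
    for i in range(n_states):
        last = min(i + max_jump, n_states - 1)
        allowed = range(i, last + 1)
        p = 1.0 / len(allowed)
        rows.append([p if j in allowed else 0.0 for j in range(n_states)])
    return rows


A_ergodic = ergodic_transitions(4)
A_bakis = bakis_transitions(4)

for name, matrix in (("ergodic", A_ergodic), ("Bakis", A_bakis)):
    print(name)
    for row in matrix:
        print("  " + " ".join(f"{value:.2f}" for value in row))
```

Transitions that start at zero remain zero under Baum-Welch re-estimation, so the Bakis structure constrains the trained model to a forward-only progression through its states, whereas the ergodic structure leaves the state ordering to be learned from the data.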