Lexicon-building methods for an acoustic sub-word based speech recognizer

The use of an acoustic subword unit (ASWU)-based speech recognition system for isolated word recognition is discussed. Methods are proposed for generating two types of word lexicon, deterministic and statistical. It is shown that applying a modified k-means algorithm to the likelihoods derived through the Viterbi algorithm provides the best deterministic word lexicon. However, the ASWU-based speech recognizer performs better with the statistical type of word lexicon than with the deterministic type. Improving the design of the word lexicon considerably narrows the gap in recognition performance between the whole word unit (WWU)-based and the ASWU-based speech recognizers. Further improvement is expected from better design of the word lexicon.
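As a rough illustration of how a deterministic lexicon entry might be chosen from Viterbi likelihoods, the sketch below picks, for one word, the candidate ASWU sequence with the highest total log-likelihood over all training tokens of that word. This is a simplified single-cluster, medoid-style selection and is not the modified k-means procedure of the paper. The function name `pick_deterministic_entry`, the matrix layout, and the numbers are all hypothetical.

```python
from typing import Sequence


def pick_deterministic_entry(loglik: Sequence[Sequence[float]]) -> int:
    """Return the index of the candidate ASWU sequence to store in the lexicon.

    loglik[i][j] is assumed to be the Viterbi log-likelihood of training
    token i scored against candidate ASWU sequence j (hypothetical layout).
    The candidate with the largest summed log-likelihood is selected.
    """
    if not loglik or not loglik[0]:
        raise ValueError("loglik must be a non-empty matrix")
    n_candidates = len(loglik[0])
    if any(len(row) != n_candidates for row in loglik):
        raise ValueError("all rows must have the same number of candidates")

    # Total log-likelihood of every training token under each candidate.
    totals = [sum(row[j] for row in loglik) for j in range(n_candidates)]

    # Medoid-style choice: the candidate that explains the tokens best.
    return max(range(n_candidates), key=totals.__getitem__)


# Illustrative numbers only: three training tokens, three candidates.
example = [
    [-10.0, -12.0, -15.0],
    [-11.0, -9.0, -14.0],
    [-13.0, -10.5, -12.0],
]
best = pick_deterministic_entry(example)
```

A deterministic lexicon in this sense keeps a single ASWU sequence per word. A statistical lexicon would retain more information about the variation across tokens, which is consistent with the better performance reported for it.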
