A variable duration acoustic segment HMM for hard-to-recognize words and phrases

The authors consider how hidden Markov models (HMMs) can be modified to accommodate a segment-based representation, and how word and subword models can be combined to improve recognition performance. The investigation is conducted within the context of a system that attempts to spot confusable words and phrases in the VOYAGER continuous speech corpus. Specifically, the authors describe how segments of varying duration are determined so that the measurements used to estimate the model parameters can be made on these segments. The word-spotting system is also described in detail.
