Paper information - A unified way in incorporating segmental feature and segmental model into HMM

There are two major approaches to speech recognition: frame-based and segment-based. The frame-based approach, e.g. the HMM, assumes that the observations within each state are statistically independent and identically distributed, and it imposes only weak duration constraints. The segment-based approach is computationally expensive, and its modelling easily becomes coarse when few 'templates' are stored. This paper presents a new framework that incorporates the segmental feature and the segmental model into the frame-based HMM in a unified way, to exploit the advantages of both methods. In the modified Viterbi algorithm, frame-based information prunes the search down to the most probable path at each segment level, to which the segmental model can then be applied with a dramatically reduced computational load; at the same time, the segmental score refines the score obtained by the frame-based model at each level. In this way, the best path found in the end by the Viterbi algorithm is optimal.
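The interplay between frame-level pruning and segment-level rescoring described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes a left-to-right HMM that starts in state 0 and ends in its last state, and every name in it (`segmental_viterbi`, `frame_logp`, `log_trans`, `seg_score`) is hypothetical. Each (frame, state) cell keeps only the single best frame-based path together with the frame at which its current segment began; the segmental score is evaluated only for that surviving segment when the path leaves the state. Because only one segment start survives per cell, this simplified version is approximate in general.

```python
import math

NEG_INF = float("-inf")

def segmental_viterbi(frame_logp, log_trans, seg_score):
    """Viterbi search with a segment-level score added at each state exit.

    frame_logp[t][s]: log-likelihood of frame t in state s.
    log_trans[i][j]:  log transition probability from state i to state j.
    seg_score(s, start, end): hypothetical segment-level log score for
        frames start..end-1 spent in state s (assumed finite).
    Returns (total_log_score, state_path).
    """
    T, S = len(frame_logp), len(frame_logp[0])
    # score[s]: best score of a path ending in state s at the current frame,
    # excluding the segmental score of the segment that is still open.
    score = [NEG_INF] * S
    start = [0] * S                          # entry frame of the open segment
    back = [[None] * S for _ in range(T)]    # back[t][s]: state at frame t-1
    score[0] = frame_logp[0][0]

    for t in range(1, T):
        new_score = [NEG_INF] * S
        new_start = [0] * S
        for j in range(S):
            for i in range(S):
                if score[i] == NEG_INF or log_trans[i][j] == NEG_INF:
                    continue
                cand = score[i] + log_trans[i][j]
                if i != j:
                    # Leaving state i closes its segment: refine the
                    # frame-based score with the segmental score.
                    cand += seg_score(i, start[i], t)
                if cand > new_score[j]:
                    new_score[j] = cand
                    new_start[j] = start[i] if i == j else t
                    back[t][j] = i
            if new_score[j] != NEG_INF:
                new_score[j] += frame_logp[t][j]
        score, start = new_score, new_start

    final = S - 1
    if score[final] == NEG_INF:
        return NEG_INF, []
    # Close the last open segment.
    total = score[final] + seg_score(final, start[final], T)

    # Trace the surviving path back to the first frame.
    path = [final]
    for t in range(T - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return total, path
```

With a `seg_score` that always returns 0, the function reduces to ordinary frame-based Viterbi decoding; a duration-sensitive `seg_score` can change which segmentation wins, which is the refinement effect the abstract refers to.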

Jun He | Henri Leich
