Paper information - Hidden Markov models for trajectory modeling

Hidden Markov models for trajectory modeling

Current state-of-the-art statistical speech recognition systems use hidden Markov models (HMMs) to model the speech signal. However, it is well known that HMMs do not exploit the time-dependence in the speech process, because they are limited by the assumption that observations are conditionally independent given the state sequence. Alternative techniques, such as segment modeling approaches, can effectively exploit time-dependencies in the acoustic signal by discarding the observation independence assumption. However, losing the basic HMM structure is often a high computational price to pay for improved acoustic models. In this paper, we introduce the parallel path HMM, which exploits the time-dependence in speech via parametric trajectory models while maintaining the HMM framework. We present preliminary results on Switchboard, a large-vocabulary conversational speech recognition task, demonstrating both improved modeling and potential for improved recognition performance.
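
For readers unfamiliar with the assumption the abstract refers to, it can be written out in standard notation. This is an illustrative sketch only: the symbols (observations o_t, states s_t, sequence length T) and the polynomial trajectory form (order K, coefficients b_k, normalized time tau_t, residual e_t, segment length N) are assumptions chosen for illustration and are not taken from the paper itself.

```latex
% Conditional independence of observations given the state sequence
% (the standard HMM output assumption):
\[
  p(o_1, \dots, o_T \mid s_1, \dots, s_T) \;=\; \prod_{t=1}^{T} p(o_t \mid s_t)
\]

% One common parametric trajectory form (assumed here, not quoted from the paper):
% within a segment of N frames, the mean follows a low-order polynomial in
% normalized time, so neighboring frames share the coefficients b_k.
\[
  o_t \;=\; \sum_{k=0}^{K} b_k \, \tau_t^{\,k} + e_t,
  \qquad \tau_t = \frac{t-1}{N-1}, \quad t = 1, \dots, N, \quad N \ge 2,
  \qquad e_t \sim \mathcal{N}(0, \Sigma)
\]
```

Under the first equation, each frame is scored independently once the state is known. Under a trajectory form such as the second, with K = 0 the mean is constant over the segment, while K >= 1 lets the mean vary smoothly with time, which is one way time-dependence within a segment can be captured.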
