Optimal filtering and smoothing for speech recognition using a stochastic target model

This paper presents a stochastic target model of speech production in which articulator motion in the vocal tract is represented by the state of a Markov-modulated linear dynamical system. The system is driven by a piecewise-deterministic control trajectory and observed through a nonlinear function representing the articulatory-acoustic mapping. Optimal filtering and smoothing algorithms for estimating the model's hidden states from acoustic measurements are derived using a measure-change technique; they require the solution of recursive integral equations. A sub-optimal approximation is developed and illustrated with examples taken from real speech.
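To make the model structure concrete, the following is a minimal generative sketch of a system of this general kind, not the paper's actual formulation. Everything specific in it is an assumption made for illustration: the scalar state, the first-order pull toward a target with coefficient `a`, the uniform jump rule for the Markov chain, the Gaussian noise variances `q` and `r`, and the use of `tanh` as a stand-in for the articulatory-acoustic mapping. The function name `simulate` and its parameters are hypothetical.

```python
import math
import random


def simulate(n_steps, targets, stay_prob, a=0.8, q=0.01, r=0.01, seed=0):
    """Sample from a toy Markov-modulated linear system with nonlinear observations.

    All modelling choices here are illustrative assumptions:
      - a discrete Markov chain picks which target is active (piecewise-constant control),
      - the hidden state x moves linearly toward the active target,
      - the observation is tanh(x) plus noise (stand-in for the acoustic mapping).
    """
    rng = random.Random(seed)
    k = len(targets)
    s = 0                # index of the active target (Markov chain state)
    x = targets[0]       # hidden continuous state, started at the first target
    modes, states, obs = [], [], []
    for _ in range(n_steps):
        # Markov chain step: stay with probability stay_prob,
        # otherwise jump uniformly to one of the other targets.
        if k > 1 and rng.random() > stay_prob:
            s = rng.choice([j for j in range(k) if j != s])
        # Linear dynamics driven by the currently active target, plus process noise.
        x = a * x + (1.0 - a) * targets[s] + rng.gauss(0.0, math.sqrt(q))
        # Nonlinear observation plus measurement noise.
        y = math.tanh(x) + rng.gauss(0.0, math.sqrt(r))
        modes.append(s)
        states.append(x)
        obs.append(y)
    return modes, states, obs


if __name__ == "__main__":
    modes, states, obs = simulate(20, targets=[-1.0, 1.0], stay_prob=0.9)
    for t, (s, x, y) in enumerate(zip(modes, states, obs)):
        print(f"t={t:2d} mode={s} x={x:+.3f} y={y:+.3f}")
```

In this sketch the filtering and smoothing problem would be to recover `modes` and `states` from `obs` alone. Because the observation function is nonlinear and the dynamics switch between targets, an exact solution is not a standard Kalman recursion, which is consistent with the abstract's reference to recursive integral equations and a sub-optimal approximation.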
