Transitional speech units and their representation by regressive Markov states: applications to speech recognition

Transitional speech units for American English are proposed and constructed via assimilation of active articulatory features. The effectiveness of a feature-based system that exploits these transitional units is demonstrated in evaluation experiments: the new system, which uses quadratic regressive states, achieves a 21% error rate reduction compared with a system that uses only static subphonemic units.
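To illustrate the distinction between a static state and a quadratic regressive state, the following is a minimal sketch, not the system described above. It assumes scalar observations, a Gaussian residual with a fixed variance, and a state mean that is a polynomial in the number of frames since state entry. The function names, the toy observations, and the coefficient values are all hypothetical.

```python
import math

def trend_mean(coeffs, t):
    """Polynomial trend at frame t: coeffs[0] + coeffs[1]*t + coeffs[2]*t**2 + ..."""
    return sum(b * t**k for k, b in enumerate(coeffs))

def segment_log_likelihood(obs, coeffs, variance):
    """Gaussian log-likelihood of a segment under one state.

    t counts frames since the state was entered (assumption for this sketch).
    A single coefficient gives a static (constant-mean) state; three
    coefficients give a quadratic regressive state.
    """
    ll = 0.0
    for t, o in enumerate(obs):
        residual = o - trend_mean(coeffs, t)
        ll += -0.5 * (math.log(2 * math.pi * variance) + residual**2 / variance)
    return ll

# Made-up segment with a curved trajectory, as a transitional unit might show.
obs = [1.0, 1.5, 2.4, 3.7, 5.4]
variance = 0.25  # hypothetical fixed residual variance

static_coeffs = [sum(obs) / len(obs)]   # constant mean = segment average
quadratic_coeffs = [1.0, 0.3, 0.2]      # hypothetical trend coefficients

ll_static = segment_log_likelihood(obs, static_coeffs, variance)
ll_quadratic = segment_log_likelihood(obs, quadratic_coeffs, variance)
print(ll_static, ll_quadratic)
```

On this toy segment the quadratic state scores higher than the static one because its mean follows the trajectory within the segment. Nothing here reproduces the reported experiments.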
