Paper information - A Paradigm for Limited Vocabulary Speech Recognition Based on Redundant Spectro-Temporal Feature Sets

A Paradigm for Limited Vocabulary Speech Recognition Based on Redundant Spectro-Temporal Feature Sets

Speech recognition techniques have come to rely almost completely on HMM-based frameworks. In this paper, we present a novel paradigm for small-vocabulary speech recognition based on a recently proposed word-spotting technique. Recent work using discriminative classifiers with ordered spectro-temporal features to detect the presence of keywords obtained encouraging improvements over HMM-based models. We propose to extend this approach to continuous speech recognition. Our method uses discriminative models to predict which words are present in a speech signal and to hypothesize their locations. The outputs of the discriminative word classifiers are combined into a hypothesis set, and a graph search using dynamic programming then finds the most likely sequence of words in that set. While this approach does not perform as well as state-of-the-art ASR systems, it can be particularly useful for languages for which only small amounts of annotated data are available.
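The search step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the graph structure or the scoring function, so everything here is an assumption made for illustration: the `Hypothesis` record, the positive higher-is-better `score`, and the choice to maximize the summed score over non-overlapping hypotheses (a weighted-interval-scheduling dynamic program standing in for the paper's graph search).

```python
from bisect import bisect_right
from typing import List, NamedTuple


class Hypothesis(NamedTuple):
    """One word hypothesis (hypothetical record, not taken from the paper)."""
    word: str     # word predicted by a discriminative word classifier
    start: float  # hypothesized start time, in seconds
    end: float    # hypothesized end time, in seconds
    score: float  # classifier confidence; assumed positive, higher is better


def best_word_sequence(hyps: List[Hypothesis]) -> List[str]:
    """Return the time-ordered, non-overlapping words with the highest total score."""
    hyps = sorted(hyps, key=lambda h: h.end)
    ends = [h.end for h in hyps]
    n = len(hyps)
    best = [0.0] * (n + 1)    # best[i]: best total score using the first i hypotheses
    take = [False] * (n + 1)  # take[i]: whether hypothesis i is used to reach best[i]
    prev = [0] * (n + 1)      # prev[i]: how many earlier hypotheses end before i starts

    for i, h in enumerate(hyps, start=1):
        # Count earlier hypotheses that end at or before this one starts.
        p = bisect_right(ends, h.start, 0, i - 1)
        prev[i] = p
        with_h = best[p] + h.score
        if with_h > best[i - 1]:
            best[i], take[i] = with_h, True
        else:
            best[i] = best[i - 1]

    # Backtrack from the last hypothesis to recover the chosen words.
    words = []
    i = n
    while i > 0:
        if take[i]:
            words.append(hyps[i - 1].word)
            i = prev[i]
        else:
            i -= 1
    return words[::-1]


example = [
    Hypothesis("one", 0.00, 0.40, 0.90),
    Hypothesis("won", 0.05, 0.45, 0.60),
    Hypothesis("two", 0.50, 0.90, 0.80),
    Hypothesis("too", 0.45, 0.95, 0.70),
    Hypothesis("three", 1.00, 1.50, 0.95),
]
print(best_word_sequence(example))  # ['one', 'two', 'three']
```

In this toy input, competing hypotheses for the same stretch of audio ("one" vs. "won", "two" vs. "too") overlap in time, so the dynamic program keeps only the higher-scoring member of each pair. A real system would likely also need transition or language-model costs between words, which this sketch omits.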

[1] Gy Kovács, et al. Localized spectro-temporal features for noise-robust speech recognition, 2010, 2010 International Joint Conference on Computational Cybernetics and Technical Informatics.

[2] Kenneth Thomas Schutte, et al. Parts-based models and local features for automatic speech recognition, 2009.

[3] Tony Ezzat, et al. Discriminative word-spotting using ordered spectro-temporal patch features, 2008, SAPA@INTERSPEECH.

[4] Mark J. F. Gales, et al. The Application of Hidden Markov Models in Speech Recognition, 2007, Found. Trends Signal Process.

[5] Alex Acero, et al. Spoken Language Processing: A Guide to Theory, Algorithm and System Development, 2001.

[6] Mari Ostendorf, et al. From HMM's to segment models: a unified view of stochastic modeling for speech recognition, 1996, IEEE Trans. Speech Audio Process.

[7] W. Russell, et al. Continuous hidden Markov modeling for speaker-independent word spotting, 1989, International Conference on Acoustics, Speech, and Signal Processing.