Island-driven search using broad phonetic classes

Most speech recognizers do not differentiate between reliable and unreliable portions of the speech signal during search. As a result, most of the search effort is concentrated in unreliable areas. Island-driven search addresses this problem by first identifying reliable islands and then directing the search outward from these islands into the unreliable gaps. In this paper, we develop a technique to detect islands from knowledge of hypothesized broad phonetic classes (BPCs). Using this island/gap knowledge, we explore a method to prune the search space and limit computational effort in unreliable areas. We also investigate scoring less detailed BPC models in gap regions and more detailed phonetic models in islands. Experiments on both small- and large-vocabulary tasks indicate that our island-driven search strategy improves recognition accuracy and reduces computation time.
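
The island/gap idea can be illustrated with a minimal sketch. It is not the detection or pruning method of the paper, whose details are not given here. It assumes each hypothesized BPC segment carries a reliability score in [0, 1], that a fixed threshold separates islands from gaps, and that pruning is a score beam that is narrower in gaps. The names `Segment`, `label_islands`, `beam_widths`, and `prune`, and all numeric values, are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Segment:
    bpc: str           # hypothesized broad phonetic class, e.g. "vowel"
    start: int         # first frame (inclusive)
    end: int           # last frame (exclusive)
    confidence: float  # assumed reliability score in [0, 1]


def label_islands(segments: List[Segment], threshold: float = 0.7) -> List[bool]:
    """Mark a segment as an island when its (assumed) confidence meets the threshold."""
    return [seg.confidence >= threshold for seg in segments]


def beam_widths(segments: List[Segment], is_island: List[bool],
                island_beam: float = 200.0, gap_beam: float = 50.0) -> List[float]:
    """Assign a per-frame beam width: wide in islands, narrow in gaps."""
    n_frames = max((seg.end for seg in segments), default=0)
    widths = [gap_beam] * n_frames  # frames not covered by an island count as gaps
    for seg, island in zip(segments, is_island):
        if island:
            for t in range(seg.start, seg.end):
                widths[t] = island_beam
    return widths


def prune(hyp_scores: Dict[str, float], beam: float) -> Dict[str, float]:
    """Keep hypotheses whose log score lies within `beam` of the best one."""
    best = max(hyp_scores.values())
    return {h: s for h, s in hyp_scores.items() if s >= best - beam}


segments = [Segment("vowel", 0, 3, 0.9), Segment("stop", 3, 5, 0.4)]
islands = label_islands(segments)
widths = beam_widths(segments, islands)
survivors = prune({"a": -10.0, "b": -40.0, "c": -100.0}, widths[4])
```

In this sketch the narrow gap beam discards more hypotheses per frame, which is one simple way to limit effort in unreliable regions; a real recognizer would apply the beam to active search paths rather than to a small dictionary.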
