One-Pass Coarse-to-Fine Segmental Speech Decoding Algorithm

In this paper, a novel one-pass coarse-to-fine decoding algorithm is proposed to accelerate segment model (SM) decoding. The algorithm originates from the segmentation similarity observation described in the paper and is specific to SM-based speech recognition. At each step, a coarse search is first performed to obtain coarse segmentations, and a fine search is then performed based on the derived segmentation information. This fast algorithm is successfully integrated into an SM-based Mandarin LVCSR system and reduces decoding time by more than 50% without a noticeable loss in recognition accuracy.
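The coarse-then-fine step described above can be illustrated with a toy sketch. It is not the paper's implementation: the abstract does not specify the coarse and fine models, so everything here is an assumption made for illustration, including the function name `decode`, the parameters `max_len`, `top_k` and `seg_penalty`, the one-dimensional frames, and both scoring rules. At each end frame, a cheap score based only on the segment mean ranks candidate segment start points (the coarse search), and the full per-frame score is then evaluated only for the best-ranked start points (the fine search).

```python
from typing import Dict, List, Tuple

Segment = Tuple[int, int, str]  # (start frame, end frame exclusive, label)


def decode(frames: List[float], models: Dict[str, float],
           max_len: int = 6, top_k: int = 2,
           seg_penalty: float = 1.0) -> List[Segment]:
    """Toy one-pass segmental decoder with a coarse-to-fine step per frame.

    All names and scoring rules are hypothetical; `models` maps a label
    to the mean of a 1-D toy "acoustic model".
    """
    n = len(frames)
    # Prefix sums make the coarse (segment-mean) score O(1) per segment.
    prefix = [0.0]
    for x in frames:
        prefix.append(prefix[-1] + x)

    best = [float("-inf")] * (n + 1)  # best[t]: best score of frames[:t]
    back = [None] * (n + 1)           # back[t]: (start, label) of last segment
    best[0] = 0.0

    for t in range(1, n + 1):
        # Coarse search: rank candidate start points with a cheap score
        # that looks only at the segment mean.
        coarse = []
        for s in range(max(0, t - max_len), t):
            seg_mean = (prefix[t] - prefix[s]) / (t - s)
            cheap = max(-(t - s) * (seg_mean - m) ** 2
                        for m in models.values())
            coarse.append((best[s] + cheap - seg_penalty, s))
        candidates = [s for _, s in sorted(coarse, reverse=True)[:top_k]]

        # Fine search: evaluate the full per-frame score, but only for
        # the start points kept by the coarse search.
        for s in candidates:
            for label, m in models.items():
                fine = -sum((x - m) ** 2 for x in frames[s:t])
                score = best[s] + fine - seg_penalty
                if score > best[t]:
                    best[t] = score
                    back[t] = (s, label)

    # Trace back the best segmentation from the final frame.
    segments = []
    t = n
    while t > 0:
        s, label = back[t]
        segments.append((s, t, label))
        t = s
    return segments[::-1]


if __name__ == "__main__":
    toy_frames = [0.0, 0.1, -0.1, 5.0, 5.1, 4.9, 0.0, 0.1]
    toy_models = {"a": 0.0, "b": 5.0}
    print(decode(toy_frames, toy_models))
```

In this sketch, setting `top_k` equal to `max_len` disables the pruning and recovers an exhaustive segmental search, which gives a simple baseline for checking how much the coarse stage changes the result.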
