Candidate expansion algorithm based on weighted syllable confusion matrix for Mandarin LVCSR

Including more potentially correct words in the candidate sets is important for improving the accuracy of Large Vocabulary Continuous Speech Recognition (LVCSR). A candidate expansion algorithm based on the Weighted Syllable Confusion Matrix (WSCM) is proposed. First, the WSCM is derived from a confusion network. Then, the recognised candidates in the confusion network are used to conjecture the most likely correct words based on the WSCM, and the conjectured words are combined with the recognised candidates to produce an expanded candidate set. Finally, a model combining mutual information with a trigram language model is used to rerank the candidates. Experiments on Mandarin film data show that, on lightly erroneous utterances, the character correction rate improves by 9.57% over the initial recognition performance.
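The expansion step can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `build_wscm` and `expand_candidates`, the use of a per-alignment weight (for example, a confusion-network posterior), the row normalisation, the `top_k` cut-off, and the toy pinyin data are all assumptions made for illustration. The sketch is limited to single-syllable words and omits the reranking stage.

```python
from collections import defaultdict


def build_wscm(aligned_pairs):
    """Build a weighted syllable confusion matrix (illustrative assumption).

    aligned_pairs: iterable of (reference_syllable, recognised_syllable, weight).
    Returns {recognised: {reference: normalised weight}}, so each row
    approximates how likely each reference syllable is, given what was recognised.
    """
    counts = defaultdict(lambda: defaultdict(float))
    for ref, rec, weight in aligned_pairs:
        counts[rec][ref] += weight
    wscm = {}
    for rec, refs in counts.items():
        total = sum(refs.values())
        wscm[rec] = {ref: w / total for ref, w in refs.items()}
    return wscm


def expand_candidates(recognised, wscm, lexicon, top_k=2):
    """Add words whose syllable is easily confused with a recognised one.

    recognised: list of (word, syllable) candidates for one slot.
    lexicon: hypothetical mapping from syllable to the words pronounced that way.
    """
    expanded = [word for word, _ in recognised]
    for _, syllable in recognised:
        row = wscm.get(syllable, {})
        # Most likely reference syllables first; ties broken alphabetically.
        ranked = sorted(row.items(), key=lambda kv: (-kv[1], kv[0]))
        for ref, _ in ranked[:top_k]:
            for word in lexicon.get(ref, []):
                if word not in expanded:
                    expanded.append(word)
    return expanded


# Toy data (assumed): "si4" is often recognised where "shi4" was spoken.
pairs = [
    ("shi4", "si4", 0.6),
    ("si4", "si4", 0.9),
    ("shi4", "shi4", 0.8),
    ("shi4", "si4", 0.5),
]
lexicon = {"shi4": ["是", "事"], "si4": ["四"]}

wscm = build_wscm(pairs)
expanded = expand_candidates([("四", "si4")], wscm, lexicon)
print(expanded)
```

In this toy case the recognised candidate for "si4" is joined by the words for "shi4", because the matrix assigns "shi4" the largest weight in the "si4" row. A reranking model would then choose among the expanded candidates.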
