Partial change accent models for accented Mandarin speech recognition

Regional accents in Mandarin speech result mostly from partial phone changes due to the interlanguage system of non-native speakers. We propose partial change accent models, based on accent-specific units with acoustic model reconstruction, for accented Mandarin speech recognition. We use phonological rules of dialectal pronunciations together with a likelihood ratio test to model actual accented variants rather than inherent phonetic confusions, recognizer errors, or other data-specific variations. To avoid the model confusion and lexical confusion that come with an enlarged unit inventory, we improve model resolution by reconstructing the pre-trained acoustic model with the Gaussian mixtures from the accent-specific unit models, treating the accent-specific units as hidden models. We evaluate the approach on Cantonese-accented Mandarin speech. The proposed method yields a significant 4.4% absolute reduction in word error rate (WER) without degrading performance on the native speech recognition task. The reconstructed model can be used in a single system to handle both accented and native speech.
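
The reconstruction step can be pictured with a small sketch. The abstract does not give the exact procedure, so the following is only an illustration under stated assumptions: each phone state is a diagonal-covariance Gaussian mixture, and the Gaussians of an accent-specific unit are appended to the pre-trained canonical unit with a fixed share of the mixture weight. The names `DiagGMM`, `reconstruct`, and `accent_mass`, the value 0.3, and the toy two-dimensional features are all hypothetical and are not taken from the paper.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class DiagGMM:
    """Diagonal-covariance Gaussian mixture for one phone state (illustrative)."""
    weights: List[float]
    means: List[List[float]]
    variances: List[List[float]]

    def log_likelihood(self, x: List[float]) -> float:
        # log sum_k w_k N(x; mu_k, diag(var_k)), computed with log-sum-exp.
        logs = []
        for w, mu, var in zip(self.weights, self.means, self.variances):
            lp = math.log(w)
            for xi, m, v in zip(x, mu, var):
                lp += -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
            logs.append(lp)
        top = max(logs)
        return top + math.log(sum(math.exp(l - top) for l in logs))


def reconstruct(base: DiagGMM, accent: DiagGMM, accent_mass: float = 0.3) -> DiagGMM:
    """Append the accent unit's Gaussians to the canonical unit.

    accent_mass is an assumed interpolation weight, not a value from the paper.
    The canonical unit keeps its name, so the lexicon does not grow; the
    accent-specific unit only contributes mixture components.
    """
    if not 0.0 < accent_mass < 1.0:
        raise ValueError("accent_mass must lie strictly between 0 and 1")
    weights = [(1.0 - accent_mass) * w for w in base.weights]
    weights += [accent_mass * w for w in accent.weights]
    return DiagGMM(weights, base.means + accent.means, base.variances + accent.variances)


# Toy example: a canonical unit and a hypothetical accented variant of it.
canonical = DiagGMM([1.0], [[0.0, 0.0]], [[1.0, 1.0]])
accented_variant = DiagGMM([1.0], [[3.0, 3.0]], [[1.0, 1.0]])
merged = reconstruct(canonical, accented_variant)

accented_frame = [2.8, 3.1]   # frame close to the accented variant
native_frame = [0.1, -0.1]    # frame close to the canonical pronunciation
gain = merged.log_likelihood(accented_frame) - canonical.log_likelihood(accented_frame)
loss = canonical.log_likelihood(native_frame) - merged.log_likelihood(native_frame)
```

In this toy setting the merged model scores the accented frame much higher than the canonical model does, while the score of the native frame drops by at most `-log(1 - accent_mass)`. That mirrors, in miniature, the abstract's claim that one reconstructed model can serve both accented and native speech.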
