Language modeling for mixed language speech recognition using weighted phrase extraction

To train a code switching language model for mixed language speech recognition, we propose assigning weights to the sentence pairs in the parallel text data. The code switching language model, which is composed of a code switching boundary prediction model, a code switching translation model, and a reconstruction model, is incorporated with a language model for mixed language speech recognition. The code switching translation model, which is trained on selected subsets of the sentence pairs in the parallel text data, allows the decoder to decide whether a phrase is in the matrix language or in the embedded language. Moreover, we propose a weighting procedure for training the code switching translation model. We evaluate our methods on Mandarin-English code switching lecture speech and lunch conversations. Our proposed method reduces word error rate by a statistically significant 1.74% on the lecture speech, and by 1.29% on the lunch conversations, over the conventional interpolated language model.
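
As an illustration of what weighting sentence pairs during phrase extraction could look like, the sketch below estimates phrase translation probabilities by weighted relative frequency: each extracted phrase pair contributes its sentence pair's weight instead of a count of 1. The abstract does not specify the estimator or how the weights are obtained, so the function name `weighted_phrase_table`, the input layout, and the example weights are all assumptions made for illustration only.

```python
from collections import defaultdict


def weighted_phrase_table(extracted, weights):
    """Estimate p(target | source) by weighted relative frequency.

    extracted: iterable of (sentence_id, source_phrase, target_phrase)
               tuples, one per phrase pair extracted from a sentence pair.
    weights:   dict mapping sentence_id to a non-negative weight
               (hypothetical; sentence pairs without an entry get 1.0).
    """
    joint = defaultdict(float)     # weighted count of (source, target)
    marginal = defaultdict(float)  # weighted count of source
    for sent_id, src, tgt in extracted:
        w = weights.get(sent_id, 1.0)
        joint[(src, tgt)] += w
        marginal[src] += w
    # Skip sources whose total weight is zero to avoid division by zero.
    return {
        (src, tgt): count / marginal[src]
        for (src, tgt), count in joint.items()
        if marginal[src] > 0.0
    }


# Toy example with romanized Mandarin source phrases (made-up data).
extracted = [
    (0, "yanjiang", "lecture"),
    (1, "yanjiang", "speech"),
    (2, "yanjiang", "lecture"),
]
weights = {0: 2.0, 1: 1.0, 2: 1.0}  # assumed weights, e.g. from a relevance score
table = weighted_phrase_table(extracted, weights)
print(table[("yanjiang", "lecture")])  # 0.75
print(table[("yanjiang", "speech")])   # 0.25
```

With uniform weights this reduces to ordinary relative-frequency estimation; raising the weight of sentence pairs that resemble the target domain shifts probability mass toward their phrase pairs.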
