Integration of MLLR adaptation with pronunciation proficiency adaptation for non-native speech recognition

To recognize non-native speech, the larger acoustic and linguistic distortions it exhibits must be handled adequately in acoustic modeling, language modeling, lexical modeling, and/or the decoding strategy. In this paper, a novel method to enhance MLLR adaptation of acoustic models for non-native speech recognition is proposed. For native speech recognition, MLLR speaker adaptation has been applied successfully because it enables efficient adaptation with a small amount of adaptation data by using a regression tree over the Gaussian mixtures of the HMMs. For non-native speech, however, the regression tree built from the baseline HMMs in most cases does not match the pronunciation proficiency of the speaker. This paper provides a solution to this problem: the speaker's proficiency is estimated automatically and a tree suited to that proficiency is built, which can be viewed as proficiency adaptation. Recognition experiments show that MLLR with the new tree raises the average error reduction rate to about 30 %, compared with approximately 20 % for baseline MLLR.
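
For reference, standard MLLR (in the style of Leggetter and Woodland) adapts each Gaussian mean with an affine transform shared by all Gaussians in the same regression class; the sketch below uses our own notation, not the paper's, with r(m) denoting the regression class that the tree assigns to Gaussian m:

\hat{\mu}_m = A_{r(m)} \mu_m + b_{r(m)} = W_{r(m)} \xi_m, \qquad \xi_m = [1,\ \mu_m^{\top}]^{\top},

where each transform W_r = [b_r \; A_r] is estimated by maximizing the likelihood of the adaptation data assigned to class r, and the regression tree decides how many transforms can be reliably estimated from the available data. Under this reading, the proposed method replaces the baseline tree with one matched to the speaker's estimated proficiency, presumably so that Gaussians affected similarly by non-native pronunciation end up sharing a transform.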