Unsupervised acoustic model adaptation based on phoneme error minimization

In this paper, a new decoding method for unsupervised acoustic model adaptation is presented. In an unsupervised adaptation framework, the effectiveness of the adaptation process is strongly affected by misrecognized labels, so selecting the adaptation data according to confidence measures is effective. We propose a phoneme error minimization framework that yields more accurate phoneme labels, combined with phoneme-level confidence measures, to improve unsupervised adaptation. Experimental results showed that the proposed method reduced the number of misrecognized labels used in the adaptation process and consequently improved adaptation accuracy. Furthermore, the proposed method was confirmed to be effective in an iterative unsupervised adaptation framework.
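The confidence-guided data selection mentioned in the abstract can be illustrated with a minimal sketch. It assumes the decoder has already produced phoneme segments with a confidence score in [0, 1]; the `PhonemeSegment` type, the `select_adaptation_segments` function, and the 0.7 threshold are hypothetical names and values chosen for illustration, not taken from the paper. The phoneme error minimization decoding itself is not shown.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PhonemeSegment:
    """One hypothesized phoneme from a first-pass decode (hypothetical type)."""
    label: str         # phoneme label hypothesized by the decoder
    start_frame: int   # first frame of the segment
    end_frame: int     # one past the last frame of the segment
    confidence: float  # assumed to lie in [0, 1], e.g. a posterior probability


def select_adaptation_segments(
    segments: List[PhonemeSegment], threshold: float = 0.7
) -> List[PhonemeSegment]:
    """Keep only segments whose confidence reaches the threshold.

    Segments below the threshold are treated as likely misrecognitions and
    are excluded from the adaptation data. The threshold is an assumed value.
    """
    return [s for s in segments if s.confidence >= threshold]


# Toy first-pass hypothesis with made-up confidence scores.
hypothesis = [
    PhonemeSegment("k", 0, 8, 0.95),
    PhonemeSegment("a", 8, 20, 0.40),
    PhonemeSegment("t", 20, 27, 0.82),
]

selected = select_adaptation_segments(hypothesis)
print([s.label for s in selected])  # prints ['k', 't']
```

In an iterative setting, the selected segments would be used to re-estimate the adaptation transform, the data would be decoded again with the adapted model, and the selection would be repeated on the new hypothesis.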
