INTERDEPENDENCE OF LANGUAGE MODELS AND DISCRIMINATIVE TRAINING

In this paper, the interdependence of language models and discriminative training for large vocabulary speech recognition is investigated. In addition, a constrained recognition approach based on word graphs is presented for the efficient determination of alternative word sequences for discriminative training. Experiments were carried out on the ARPA Wall Street Journal corpus. The recognition results for MMI training show a significant dependence on the context length of the language model used during training; the best results were obtained with a unigram language model for MMI training. No significant correlation was observed between the choice of language model for training and the choice for recognition.
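For context, the maximum mutual information (MMI) criterion underlying this kind of discriminative training is commonly written in the following standard form; this is the textbook formulation, not an equation reproduced from the paper, and the symbol names are illustrative:

F_{\mathrm{MMI}}(\lambda) = \sum_{r=1}^{R} \log \frac{p_\lambda(X_r \mid W_r)\, P(W_r)}{\sum_{W} p_\lambda(X_r \mid W)\, P(W)}

Here X_r denotes the r-th training utterance, W_r its reference transcription, p_\lambda the acoustic model, and P the language model. The denominator sum over competing word sequences W is what the word-graph constrained recognition described above is used to approximate, and the language model P appearing in that sum is the one whose context length (e.g. unigram versus longer-span models) is varied in the training experiments.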
