Machine Translation Enhanced Automatic Speech Recognition

In human-mediated translation scenarios, a human interpreter translates between a source and a target language, working from either a spoken or a written representation of the source language. This work improves recognition performance on the speech of the human translator, spoken in the target language (English), by taking advantage of the source language (Spanish) representations. Machine translation techniques are used to translate between the source and target language resources, and the target language speech recognizer is then biased towards the knowledge gained; hence the name Machine Translation Enhanced Automatic Speech Recognition (MTE-ASR). Several basic MTE-ASR techniques are investigated: restricting the search vocabulary, selecting hypotheses from n-best lists, and applying cache and interpolation schemes to language modeling. Given a written representation of the source language, a non-iterative combination of the most successful basic techniques outperforms the English baseline ASR system by a relative word error rate reduction of 30.6%. Given a spoken source language representation, where a source language ASR system must first produce a written representation for further processing, the reduction is still 23.2%. An iterative system design, which recursively applies the improved ASR output to enhance the involved MT system(s) and thereby further improve ASR, increases these word error rate reductions to 37.7% and 29.9%, respectively.
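As a rough illustration of two of the basic techniques named above (hypothesis selection from n-best lists, and a cache language model interpolated with the baseline model), the following minimal sketch may help. It is not the authors' implementation: the function names, the unigram toy models, the interpolation weight `lam`, and the probability floor are all assumptions made for illustration, whereas a real system would use n-gram models and tuned weights.

```python
import math
from collections import Counter


def cache_unigram(mt_translations):
    """Build a toy unigram 'cache' model from MT output of the source text."""
    counts = Counter(
        word for sentence in mt_translations for word in sentence.lower().split()
    )
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def interpolated_logprob(words, base_lm, cache_lm, lam=0.3, floor=1e-6):
    """Log-probability under a linear interpolation of baseline and cache models.

    `lam` (hypothetical value) is the weight of the cache model; unseen words
    fall back to `floor` in the baseline model so the logarithm stays defined.
    """
    logprob = 0.0
    for word in words:
        p = (1.0 - lam) * base_lm.get(word, floor) + lam * cache_lm.get(word, 0.0)
        logprob += math.log(p)
    return logprob


def rescore_nbest(nbest, base_lm, cache_lm, lam=0.3, lm_weight=1.0):
    """Pick the hypothesis with the best combined acoustic and LM score.

    `nbest` is a list of (hypothesis_text, acoustic_log_score) pairs.
    """
    def total_score(entry):
        text, acoustic = entry
        lm = interpolated_logprob(text.lower().split(), base_lm, cache_lm, lam)
        return acoustic + lm_weight * lm

    return max(nbest, key=total_score)[0]


# Toy data: the baseline model cannot tell "house" from "mouse",
# but the MT output of the source sentence contains "house".
base_lm = {"the": 0.05, "house": 0.001, "mouse": 0.001, "is": 0.03, "white": 0.002}
cache_lm = cache_unigram(["the house is white"])
nbest = [("the mouse is white", -10.0), ("the house is white", -10.2)]
print(rescore_nbest(nbest, base_lm, cache_lm))
```

With `lam=0.0` the cache has no influence and the acoustically preferred hypothesis is kept; with a positive `lam`, words supported by the MT output are boosted, which is the biasing effect the abstract describes.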
