Investigating the role of machine translated text in ASR domain adaptation: Unsupervised and semi-supervised methods

This study investigates the use of machine translated text for ASR domain adaptation. The proposed methodology applies when domain-specific data is available only in language X, whereas the goal is to develop a domain-specific system in language Y. Two semi-supervised methods are introduced and compared with a fully unsupervised approach, which serves as the baseline. While both the unsupervised and the semi-supervised approaches make it possible to quickly develop an accurate domain-specific ASR system, the semi-supervised approaches outperform the unsupervised one by 10% to 29% relative, depending on the amount of human post-processed data available. An in-depth analysis explaining how the machine translated text improves the performance of the domain-specific ASR system is given at the end of the paper.
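
As a reading aid for the "10% to 29% relative" figure, the sketch below shows how a relative improvement is usually computed from two error rates. It assumes the comparison is made on an error metric such as word error rate, which the abstract does not state; the function name and the numbers are hypothetical and are not taken from the paper.

```python
def relative_improvement(baseline_error: float, improved_error: float) -> float:
    """Return the error reduction as a fraction of the baseline error."""
    if baseline_error <= 0:
        raise ValueError("baseline_error must be positive")
    return (baseline_error - improved_error) / baseline_error


# Hypothetical error rates in percent, chosen only for illustration
# (assumption: not results reported in the paper).
baseline = 20.0   # e.g. an unsupervised baseline
improved = 18.0   # e.g. a semi-supervised system

print(f"{relative_improvement(baseline, improved):.1%}")  # prints 10.0%
```

Under this definition, a drop from 20% to 18% is a 2-point absolute gain but a 10% relative gain, which is the sense in which relative figures are normally quoted.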
