Development of the SRI/Nightingale Arabic ASR system

We describe the large-vocabulary automatic speech recognition system that the SRI/Nightingale team developed for Modern Standard Arabic and used in the 2007 GALE evaluation as part of the speech translation system. We show how system performance is affected by different development choices, ranging from text processing and lexicon design to the architecture of the decoding system. Word error rate results are reported on broadcast news and conversational data from the GALE development and evaluation test sets.
