On the use of morphological analysis for dialectal Arabic speech recognition

Arabic has a large number of affixes that can modify a stem to form words. In automatic speech recognition (ASR), this leads to a high out-of-vocabulary (OOV) rate for a typical lexicon size, and hence a potential increase in word error rate (WER). The problem is even more pronounced for dialects of Arabic, where additional affixes are often introduced and the available data is typically sparse. To address it, we introduce a simple word decomposition algorithm that requires only a text corpus and a predefined list of affixes. Using this algorithm to create the lexicon for Iraqi Arabic ASR yields about a 10% relative improvement in WER. Using the union of the segmented and unsegmented vocabularies and interpolating the corresponding language models reduces WER further, for a net improvement of about 13%.
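The abstract gives only the inputs of the decomposition (a text corpus and an affix list), not its rules, so the following is a minimal sketch under stated assumptions rather than the authors' algorithm. It assumes that an affix is stripped only when the remaining stem is itself a word seen in the corpus and is at least `MIN_STEM_LEN` characters long. The affix lists, the Latin transliteration, the `+` boundary markers, and the function names are all hypothetical. The abstract also does not say how the two language models are interpolated; `interpolate` assumes plain linear interpolation with a weight `lam`.

```python
# Hypothetical affix lists in a Latin transliteration (not taken from the paper).
PREFIXES = ("wAl", "Al", "w", "b")
SUFFIXES = ("hm", "At", "h")
MIN_STEM_LEN = 2  # assumed lower bound on the stem length


def decompose(word, vocab, prefixes=PREFIXES, suffixes=SUFFIXES):
    """Split `word` into [prefix+] stem [+suffix] if the stem is in `vocab`.

    Longer affixes are tried first; the empty string stands for "no affix".
    Returns [word] unchanged when no admissible split is found.
    """
    prefix_options = sorted(prefixes, key=len, reverse=True) + [""]
    suffix_options = sorted(suffixes, key=len, reverse=True) + [""]
    for pre in prefix_options:
        if not word.startswith(pre):
            continue
        for suf in suffix_options:
            if not word.endswith(suf):
                continue
            if not pre and not suf:
                continue  # stripping nothing is not a decomposition
            stem = word[len(pre):len(word) - len(suf)]
            if len(stem) >= MIN_STEM_LEN and stem in vocab:
                parts = [stem]
                if pre:
                    parts.insert(0, pre + "+")
                if suf:
                    parts.append("+" + suf)
                return parts
    return [word]


def segment_corpus(lines):
    """Segment every word of a whitespace-tokenized corpus.

    The vocabulary used to validate stems is built from the corpus itself.
    """
    vocab = {w for line in lines for w in line.split()}
    return [
        " ".join(p for w in line.split() for p in decompose(w, vocab))
        for line in lines
    ]


def interpolate(p_segmented, p_unsegmented, lam=0.5):
    """Linearly interpolate two language-model probabilities (assumed scheme)."""
    return lam * p_segmented + (1.0 - lam) * p_unsegmented


corpus = ["ktAb byt", "wAlktAb ktAbhm"]
print(decompose("wAlktAb", {"ktAb"}))  # ['wAl+', 'ktAb']
print(segment_corpus(corpus))
```

Requiring the stem to be attested in the corpus is one way to keep a purely list-based splitter from cutting words that merely begin or end with an affix-like string, such as `byt` above. In practice the interpolation weight would be tuned on held-out data.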
