Investigating the impact of phonetic cross language modeling on Arabic and English speech recognition

The scarcity of Arabic speech resources is one of the main obstacles facing speech researchers. In previous work, we built two automatic speech recognition (ASR) systems, one for English and one for Arabic, using two corpora: TIMIT for English and West Point for Arabic. Cross-language experiments were conducted with the two systems, and the results were reported with respect to the main phoneme classes of each language. As a continuation of that work, this paper analyzes the errors of the two systems to determine how each corpus strengthens or weakens recognition accuracy when phoneme models are swapped between the languages.
