Paper information - The impact of phonological rules on Arabic speech recognition

The impact of phonological rules on Arabic speech recognition

Pronunciation variation is a well-known phenomenon that has been widely investigated in automatic speech recognition (ASR). Knowledge-based phonological rules are generally used to capture accurate phonetic realizations and thereby minimize the mismatch between the ASR dictionary and the actual phonetic representation of the speech signal. A number of studies have applied these rules to Arabic ASR systems; however, little research has been devoted to measuring the individual contribution of each rule. In this paper, we aim to determine the exact effect of each rule and to identify the rules that have no influence. We used the Carnegie Mellon University PocketSphinx speech recognizer with a new “in-house” Modern Standard Arabic speech corpus that contains 19 h of speech for training and 3.7 h for testing. We evaluated the effect of three well-known rules (Shadda, Tanween, and the solar letters). The experimental results show no clear evidence that using phonological rules for ASR dictionary adaptation enhances performance for within-word pronunciation variation. These results might indicate a need to reconsider this approach or to focus on other aspects of ASR performance, such as cross-word pronunciation variation and the optimal phoneme set for the Arabic language.

Fawaz S. Al-Anzi | Dia AbuZeina
