Statistical Analysis of Arabic Phonemes Used in Arabic Speech Recognition

This study presents a statistical analysis of Arabic phonemes, motivated by their significant role in continuous Arabic Automatic Speech Recognition (ASR) systems. When building an Arabic speech recognizer, knowing the number of frames that a phoneme occupies, the phoneme boundaries, and the number of Hidden Markov Model (HMM) states necessary to represent the phoneme greatly helps in enhancing recognition accuracy. In this paper we statistically analyze the KACST 5-hour corpus, which was used in Arabic speech recognition for both training and recognition. The results are presented as a set of tables and figures that are useful to Arabic speech researchers. The paper also provides a clustering graph for Arabic phonemes based on the median, and a trigram table for all phonemes that gives the frequency with which each phoneme appears in trigrams. The findings are consistent with the observations of Arabic speech scientists.
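As a rough illustration of the kind of statistics described above, the following minimal Python sketch computes a per-phoneme median frame count and phoneme trigram frequencies from a frame-aligned transcription. The input format (a list of utterances, each a list of `(phoneme, n_frames)` pairs), the function name `phoneme_stats`, and the toy labels are assumptions made for illustration only. They do not reflect the actual KACST corpus format or the exact procedure used in the paper, and taking the median over frame counts is itself an assumption.

```python
from collections import Counter, defaultdict
from statistics import median


def phoneme_stats(utterances):
    """Summarize a frame-aligned phoneme transcription (hypothetical format).

    utterances: list of utterances, each a list of (phoneme, n_frames) pairs.
    Returns (medians, trigram_counts, phoneme_in_trigram).
    """
    durations = defaultdict(list)   # phoneme -> list of frame counts
    trigram_counts = Counter()      # (p1, p2, p3) -> number of occurrences

    for utt in utterances:
        labels = [p for p, _ in utt]
        for p, n_frames in utt:
            durations[p].append(n_frames)
        # Slide a window of three consecutive phonemes over the utterance.
        for tri in zip(labels, labels[1:], labels[2:]):
            trigram_counts[tri] += 1

    # Median number of frames occupied by each phoneme.
    medians = {p: median(v) for p, v in durations.items()}

    # How many trigram occurrences contain each phoneme at least once.
    phoneme_in_trigram = Counter()
    for tri, count in trigram_counts.items():
        for p in set(tri):
            phoneme_in_trigram[p] += count

    return medians, trigram_counts, phoneme_in_trigram


if __name__ == "__main__":
    # Toy data with made-up labels and frame counts.
    toy = [
        [("b", 5), ("a", 9), ("t", 6), ("a", 11)],
        [("a", 7), ("b", 3), ("a", 8)],
    ]
    medians, trigrams, in_trigram = phoneme_stats(toy)
    print(medians)
    print(trigrams.most_common(3))
    print(in_trigram)
```

Per-phoneme medians of this kind could then serve as features for grouping phonemes with similar durations, which is one plausible reading of a median-based clustering graph.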