Statistical Analysis of Arabic Phonemes for Continuous Arabic Speech Recognition

Although Arabic is the world’s second most spoken language in terms of the number of speakers, Arabic automatic speech recognition (AASR) has not received the attention it deserves from the research community. In this paper, we present a thorough statistical analysis of the Arabic phonemes in a widely used Arabic corpus that was developed by King Fahd University of Petroleum and Minerals (KFUPM) with the support of King Abed Al-Aziz City for Science and Technology (KACST). We study several parameters: the number of frames a phoneme occupies, the phoneme frequency, the mean length in frames, and the standard deviation, mode, and median of the phoneme boundary. We also study language-model information such as phoneme bigrams. The results show that phonemes can be clustered into groups. Based on these statistics, one can design the most suitable HMM for each phoneme in terms of the number of states and other model parameters.

Keywords: Phoneme; Arabic Speech Recognition; MFCC; Mode; Median; KACST Arabic speech corpus; HMM; Acoustic Model.
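As an illustration of the kind of statistics listed above, the following minimal sketch computes per-phoneme frequency, mean, standard deviation, mode, and median of segment length in frames, together with phoneme bigram counts. It is not the authors' implementation: the input format (a list of `(phoneme, n_frames)` pairs in utterance order), the function names `phoneme_stats` and `bigram_counts`, the phoneme labels, and the toy values are all assumptions made for the example, and the population standard deviation is used as one possible choice.

```python
from collections import Counter, defaultdict
from statistics import mean, median, mode, pstdev


def phoneme_stats(segments):
    """Summarise segment lengths (in frames) for each phoneme.

    `segments` is a hypothetical input format: a list of
    (phoneme_label, n_frames) pairs in the order they occur.
    """
    lengths = defaultdict(list)
    for phoneme, n_frames in segments:
        lengths[phoneme].append(n_frames)

    stats = {}
    for phoneme, values in lengths.items():
        stats[phoneme] = {
            "count": len(values),      # phoneme frequency
            "mean": mean(values),      # mean length in frames
            "std": pstdev(values),     # population standard deviation
            "mode": mode(values),      # most common length (first one on ties)
            "median": median(values),  # median length
        }
    return stats


def bigram_counts(segments):
    """Count adjacent phoneme pairs (phoneme bigrams)."""
    labels = [phoneme for phoneme, _ in segments]
    return Counter(zip(labels, labels[1:]))


# Toy data; labels and frame counts are invented for illustration only.
toy = [("AE", 5), ("B", 3), ("AE", 7), ("B", 3), ("AE", 6)]
stats = phoneme_stats(toy)
bigrams = bigram_counts(toy)
```

Statistics of this kind could then feed the grouping of phonemes and the choice of the number of HMM states per phoneme that the abstract describes; how that mapping is made is not specified here.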
