Speaker environment classification using rhythm metrics in Levantine Arabic dialect

This paper investigates whether rhythm metrics can be used to classify speakers by gender and/or social environment, where a speaker's environment may be reflected in speech through factors such as second-language influence and way of life. The BBN/AUB (BBN Technologies and American University of Beirut) corpus was used; it contains four subsets of native Levantine dialect speakers of both genders from different locations. Classification was performed with artificial neural networks (ANNs) trained on rhythm metrics. The ANN classifier reached 65.22% accuracy using only the Interval Measures metrics, and 70.79% accuracy when all 11 rhythm metrics were used.
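The abstract does not define the Interval Measures, so the following is only a minimal sketch, assuming they are the three metrics commonly attributed to Ramus et al. (1999): %V (share of utterance duration that is vocalic), ΔV (standard deviation of vocalic interval durations), and ΔC (standard deviation of consonantal interval durations). The function name `interval_measures`, the `(label, duration)` input format, the sample durations, and the choice of population standard deviation are all illustrative assumptions, not the authors' implementation.

```python
from statistics import pstdev


def interval_measures(intervals):
    """Compute %V, deltaV and deltaC for one utterance.

    intervals: list of (label, duration_in_seconds) pairs, where label is
    "V" for a vocalic interval and "C" for a consonantal interval.
    This input format is a hypothetical convention for this sketch.
    """
    vocalic = [dur for label, dur in intervals if label == "V"]
    consonantal = [dur for label, dur in intervals if label == "C"]
    if not vocalic or not consonantal:
        raise ValueError("need at least one vocalic and one consonantal interval")

    total = sum(vocalic) + sum(consonantal)
    return {
        # Percentage of total duration taken up by vocalic intervals.
        "percent_v": 100.0 * sum(vocalic) / total,
        # Population standard deviation is an assumption of this sketch.
        "delta_v": pstdev(vocalic),
        "delta_c": pstdev(consonantal),
    }


# Made-up segment durations, for illustration only.
sample = [("C", 0.08), ("V", 0.12), ("C", 0.10),
          ("V", 0.08), ("C", 0.12), ("V", 0.10)]
metrics = interval_measures(sample)
```

In a setup like the one the abstract describes, one such feature vector per utterance (extended with the remaining rhythm metrics) would be the input to the classifier.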
