Hierarchical Classification for Spoken Arabic Dialect Identification using Prosody: Case of Algerian Dialects

In daily communication, Arabs use local dialects that are hard to identify automatically with conventional classification methods. The already challenging task of dialect identification becomes harder still when dealing with under-resourced dialects belonging to the same country or region. In this paper, we start by statistically analyzing Algerian dialects in order to capture their specificities in terms of prosodic information, which is extracted at the utterance level after a coarse-grained consonant/vowel segmentation. Based on the findings of this analysis, we propose a Hierarchical classification approach for spoken Arabic Algerian Dialect IDentification (HADID). It takes advantage of the fact that dialects are naturally structured into a hierarchy. Within HADID, a top-down hierarchical classification is applied, in which we use Deep Neural Networks (DNNs) to build a local classifier for every parent node in the dialect hierarchy. Our framework is implemented and evaluated on an Algerian Arabic dialect corpus, and the dialect hierarchy is deduced from historical and linguistic knowledge. The results reveal that, within HADID, DNNs perform better than Support Vector Machines as the local classifier. In addition, compared with a baseline flat classification system, HADID gives an improvement of 63.5% in terms of precision. Furthermore, the overall results demonstrate the suitability of our prosody-based HADID for speaker-independent dialect identification while requiring test utterances of less than 6 s.
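The top-down scheme described above, with one local classifier per parent node, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-level hierarchy and its node names (`root`, `group_A`, `dialect_1`, ...), the two-dimensional toy feature vectors standing in for utterance-level prosodic features, and the nearest-centroid classifier, which replaces the DNNs used in the paper so that the example stays self-contained.

```python
from math import dist

# Hypothetical two-level hierarchy: parent node -> list of child nodes.
HIERARCHY = {
    "root": ["group_A", "group_B"],
    "group_A": ["dialect_1", "dialect_2"],
    "group_B": ["dialect_3", "dialect_4"],
}


class CentroidClassifier:
    """Stand-in local classifier (nearest class centroid), NOT a DNN."""

    def fit(self, X, y):
        sums, counts = {}, {}
        for x, label in zip(X, y):
            acc = sums.setdefault(label, [0.0] * len(x))
            for i, value in enumerate(x):
                acc[i] += value
            counts[label] = counts.get(label, 0) + 1
        self.centroids = {
            label: [s / counts[label] for s in acc] for label, acc in sums.items()
        }
        return self

    def predict_one(self, x):
        return min(self.centroids, key=lambda label: dist(x, self.centroids[label]))


def leaves_under(hierarchy, node):
    """Return the set of leaf labels below `node` (or `node` itself if it is a leaf)."""
    children = hierarchy.get(node)
    if not children:
        return {node}
    leaves = set()
    for child in children:
        leaves |= leaves_under(hierarchy, child)
    return leaves


def train_local_classifiers(hierarchy, X, y):
    """Train one classifier per parent node; its targets are that node's children."""
    classifiers = {}
    for parent, children in hierarchy.items():
        child_of_leaf = {
            leaf: child for child in children for leaf in leaves_under(hierarchy, child)
        }
        # Keep only the samples whose leaf label lies below this parent.
        rows = [(x, child_of_leaf[lab]) for x, lab in zip(X, y) if lab in child_of_leaf]
        classifiers[parent] = CentroidClassifier().fit(
            [x for x, _ in rows], [target for _, target in rows]
        )
    return classifiers


def predict_top_down(hierarchy, classifiers, x, root="root"):
    """Descend from the root, applying the local classifier at each parent node."""
    node, path = root, []
    while node in hierarchy:
        node = classifiers[node].predict_one(x)
        path.append(node)
    return path


# Toy training data: made-up 2-D feature vectors with leaf (dialect) labels.
X_TRAIN = [(0.0, 0.0), (0.2, 0.1), (0.0, 2.0), (0.1, 2.2),
           (5.0, 0.0), (5.1, 0.2), (5.0, 2.0), (5.2, 2.1)]
Y_TRAIN = ["dialect_1", "dialect_1", "dialect_2", "dialect_2",
           "dialect_3", "dialect_3", "dialect_4", "dialect_4"]

CLASSIFIERS = train_local_classifiers(HIERARCHY, X_TRAIN, Y_TRAIN)
print(predict_top_down(HIERARCHY, CLASSIFIERS, (4.9, 1.9)))
```

Each local classifier only has to separate the children of its own node, which is the property that distinguishes this setup from a flat classifier choosing among all dialects at once. A misclassification at an upper level cannot be undone further down, so in a sketch like this the quality of the root classifier bounds the quality of the whole path.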
