Maghrebian dialect recognition based on support vector machines and neural network classifiers

This paper investigates the feed-forward back-propagation neural network (FFBPNN) and the support vector machine (SVM) for the classification of two Maghrebian dialects: Tunisian and Moroccan. The dialect used by Moroccan speakers is called “La Darijja” and that of Tunisians is called “Darija”. An automatic speech recognition system is implemented to identify the ten Arabic digits (zero to nine). The system consists of two phases: feature extraction, using a variety of popular hybrid techniques, and classification, using the FFBPNN and the SVM separately. The experimental results show recognition rates of 98.3% with the FFBPNN and 97.5% with the SVM.
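The classification phase can be illustrated with a minimal sketch of a feed-forward network trained by back-propagation. This is not the authors' system: the 2-D synthetic vectors stand in for real acoustic features, the two classes stand in for the two dialects, and the names `make_toy_features`, `TinyFFBPNN`, and `train_and_evaluate`, as well as the layer size, learning rate, and epoch count, are assumptions chosen for illustration. The SVM branch is omitted.

```python
import math
import random


def make_toy_features(n_per_class, rng):
    """Synthetic 2-D vectors standing in for extracted speech features (assumption)."""
    data = []
    for label, center in ((0, (-2.0, -2.0)), (1, (2.0, 2.0))):
        for _ in range(n_per_class):
            x = [rng.gauss(center[0], 0.5), rng.gauss(center[1], 0.5)]
            data.append((x, label))
    rng.shuffle(data)
    return data


class TinyFFBPNN:
    """One tanh hidden layer and one sigmoid output, trained by back-propagation."""

    def __init__(self, n_in, n_hidden, rng):
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
                   for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x):
        # Hidden activations, then the output probability of class 1.
        h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
             for row, b in zip(self.w1, self.b1)]
        z = sum(w * hi for w, hi in zip(self.w2, h)) + self.b2
        return h, 1.0 / (1.0 + math.exp(-z))

    def train_step(self, x, target, lr):
        h, y = self.forward(x)
        delta_out = y - target  # cross-entropy gradient w.r.t. the output pre-activation
        for j in range(len(h)):
            # Propagate the error back through the tanh unit before updating w2[j].
            delta_h = delta_out * self.w2[j] * (1.0 - h[j] ** 2)
            self.w2[j] -= lr * delta_out * h[j]
            for i in range(len(x)):
                self.w1[j][i] -= lr * delta_h * x[i]
            self.b1[j] -= lr * delta_h
        self.b2 -= lr * delta_out

    def predict(self, x):
        return 1 if self.forward(x)[1] >= 0.5 else 0


def train_and_evaluate(seed=0):
    rng = random.Random(seed)
    train = make_toy_features(100, rng)
    test = make_toy_features(50, rng)
    net = TinyFFBPNN(n_in=2, n_hidden=4, rng=rng)
    for _ in range(20):  # epochs of per-sample gradient descent
        for x, label in train:
            net.train_step(x, label, lr=0.1)
    correct = sum(net.predict(x) == label for x, label in test)
    return correct / len(test)


accuracy = train_and_evaluate()
print(f"toy test accuracy: {accuracy:.3f}")
```

The toy classes are well separated, so the accuracy printed here says nothing about the recognition rates reported above; in a real system the input vectors would come from the feature extraction phase.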
