Systematic Literature Review of Dialectal Arabic: Identification and Detection

It is becoming increasingly difficult to know who is working on what, and how, in computational studies of Dialectal Arabic. This study charts the field through a systematic literature review intended to give insight into the most and least popular research areas, dialects, machine learning approaches, neural network input features, data types, datasets, system evaluation criteria, publication venues, and publication trends. The review followed the norms of systematic reviews and took account of all research published between 2000 and 2020 that adopted a computational approach to Dialectal Arabic identification and detection. It collected, analyzed, and collated this research, discovered its trends, and identified research gaps. It revealed, inter alia, that research effort has not been distributed evenly between speech and text or among the vernaculars: there is some bias favoring text over speech, regional varieties over individual vernaculars, and Egyptian over all other vernaculars. Furthermore, there is a clear preference for shallow machine learning approaches; for n-grams, TF-IDF, and MFCC as neural network input features; and for accuracy as the statistical measure for validating results. The review also pointed to some glaring gaps in the research: (1) total neglect of Mauritanian and Bahraini within the continuous Arabic language area, and of such enclave varieties as Anatolian Arabic, Khuzistan Arabic, Khurasan Arabic, Uzbekistan Arabic, the Sub-Saharan Arabic of Nigeria and Chad, Djibouti Arabic, Cypriot Arabic, and Maltese; (2) scarcity of city dialect resources; (3) rarity of linguistic investigations that would complement this research; and (4) paucity of deep machine learning experimentation.
