DALILA: The Dialectal Arabic Linguistic Learning Assistant

Dialectal Arabic (DA) poses serious challenges for Natural Language Processing (NLP). The number and sophistication of tools and datasets for DA are very limited compared with those for Modern Standard Arabic (MSA) and other languages. MSA tools do not model DA effectively, which makes their direct use on dialects impractical. This is a particular challenge for building tools that support learning Arabic as a living language on the web, where authentic material appears in both MSA and DA. In this paper, we present the Dialectal Arabic Linguistic Learning Assistant (DALILA), a Chrome extension that uses cutting-edge Arabic dialect NLP research to help learners and non-native speakers understand text written in either MSA or DA. For each word, DALILA provides a dialectal word analysis and a corresponding English gloss.
