Lexical differences and similarities between Moroccan dialect and Arabic

This paper describes the relationship between the Moroccan dialect (MD) and the Arabic language through a comparative study showing that the two have a lot in common at the lexical level. For this purpose, we compared MDED, an MD electronic lexicon of 15,000 entries that we built in previous work, with Modern Standard Arabic (MSA). Among the origins recorded in MDED (MSA, French, Spanish, Tamazight and unknown), we focused on the entries of unknown origin and carried out a survey comparing them with data collected from several Arabic lexicons. The survey is a semi-automatic evaluation showing that about 39% of the MDED words of unknown origin are in fact derived from MSA, so that almost 86% of MD lexical content can be attributed to Arabic.
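The automatic half of such a survey can be sketched as follows. This is a minimal illustration and not the authors' procedure: the function name `arabic_share`, the `(word, origin)` entry format, and the exact-match lookup against a set of MSA forms are all assumptions, and the toy data is invented rather than taken from MDED. In practice the flagged candidates would still need manual review, which is what makes the evaluation semi-automatic.

```python
from collections import Counter


def arabic_share(entries, msa_words):
    """Flag unknown-origin entries found in an MSA word set and
    recompute the share of the lexicon attributed to MSA.

    entries   -- list of (word, origin) pairs (hypothetical format)
    msa_words -- set of MSA word forms gathered from Arabic lexicons
    """
    if not entries:
        raise ValueError("the lexicon must contain at least one entry")

    origins = Counter(origin for _, origin in entries)

    # Automatic step: exact-match lookup (an assumption; a real system
    # would likely normalise spelling or compare roots instead).
    candidates = [
        word for word, origin in entries
        if origin == "unknown" and word in msa_words
    ]

    total = len(entries)
    share_before = origins["MSA"] / total
    share_after = (origins["MSA"] + len(candidates)) / total
    return candidates, share_before, share_after


# Toy data, invented purely for illustration (not MDED content).
toy_entries = [
    ("w1", "MSA"),
    ("w2", "MSA"),
    ("w3", "French"),
    ("w4", "unknown"),
    ("w5", "unknown"),
]
toy_msa = {"w1", "w2", "w4"}

candidates, share_before, share_after = arabic_share(toy_entries, toy_msa)
print(candidates, share_before, share_after)
```

The same bookkeeping explains how reclassifying part of the unknown-origin entries raises the overall Arabic share: only the numerator changes, while the lexicon size stays fixed.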
