MAGEAD: A Morphological Analyzer and Generator for the Arabic Dialects

We present MAGEAD, a morphological analyzer and generator for the Arabic language family. Our work is novel in that it explicitly addresses the need for processing the morphology of the dialects. MAGEAD performs on-line analysis to, or generation from, a root+pattern+features representation; it maintains separate phonological and orthographic representations; and it allows morphemes from different dialects to be combined. We present a detailed evaluation of MAGEAD.
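To make the root+pattern part of that representation concrete, the following is a minimal illustrative sketch and not MAGEAD's implementation. It assumes a hypothetical pattern notation in which the digits 1 to 9 mark slots for root radicals and all other characters are copied through unchanged; the function name `interdigitate` and the transliterated example forms are assumptions made for illustration only.

```python
def interdigitate(root: str, pattern: str) -> str:
    """Fill the numbered slots of a pattern with the radicals of a root.

    Hypothetical notation (an assumption, not MAGEAD's format):
    a digit d in the pattern is replaced by the d-th root radical;
    every other character is copied as-is.
    """
    pieces = []
    for ch in pattern:
        if ch in "123456789":
            index = int(ch) - 1  # slots are 1-based
            if index >= len(root):
                raise ValueError(
                    f"pattern slot {ch} has no matching radical in root {root!r}"
                )
            pieces.append(root[index])
        else:
            pieces.append(ch)
    return "".join(pieces)


if __name__ == "__main__":
    # Root k-t-b combined with two illustrative patterns.
    print(interdigitate("ktb", "1a2a3"))    # katab
    print(interdigitate("ktb", "ma12uu3"))  # maktuub
```

Analysis runs in the opposite direction, recovering the root, the pattern, and the features from a surface word. This sketch covers only the generation step for a bare stem and ignores features, affixes, and phonological or orthographic rewriting.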
