Improving Arabic Texts Morphological Disambiguation Using a Possibilistic Classifier

Morphological ambiguity is an important problem that has been studied through different approaches. In this paper, we investigate several classification methods for disambiguating the morphological features of non-vocalized Arabic texts. We propose an improved possibilistic approach that handles imperfect training and test datasets, and we introduce a data transformation method that converts an imperfect dataset into a perfect one. We compare the disambiguation results of the classification approaches with those of the possibilistic classifier, which deals with imperfect contexts.
