Ant colony algorithm for Arabic word sense disambiguation through English lexical information

Identifying the intended meaning of a word in context is a central research topic in natural language processing. Many solutions exist for Word Sense Disambiguation (WSD) in languages such as English or French, but research on Arabic WSD remains limited, mainly because of the lack of resources. In this paper, we show that a WSD system for Arabic can be built from the Arabic WordNet and its links to the English Princeton WordNet. Because the Arabic WordNet does not contain definitions for its synsets, we construct a dictionary that maps Princeton WordNet definitions onto Arabic WordNet synsets. We also create an Arabic evaluation corpus and gold standard. We then use this dictionary and corpus to run and evaluate an adapted ant colony algorithm on Arabic text; the definition mapping is what allows the algorithm to use the Lesk similarity measure. The algorithm achieves a score of approximately 80%, compared with 78.9% for the random baseline.
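As a rough illustration of how definition mapping makes a Lesk-style measure usable for Arabic, the sketch below scores two Arabic synsets by the word overlap of the English definitions they are mapped to. It is a minimal sketch under stated assumptions, not the paper's implementation: the synset identifiers, the mapping table, the glosses, and the stopword list are all hypothetical toy data, and the selection step is a plain exhaustive overlap comparison rather than the ant colony algorithm.

```python
from typing import Dict, List, Set

# Hypothetical toy mapping: Arabic WordNet synset id -> Princeton WordNet synset id.
ARABIC_TO_PWN: Dict[str, str] = {
    "bank_ar_1": "bank.n.01",
    "bank_ar_2": "bank.n.02",
    "money_ar_1": "money.n.01",
}

# Hypothetical English glosses (invented for illustration, not real WordNet text).
PWN_GLOSSES: Dict[str, str] = {
    "bank.n.01": "sloping land beside a body of water",
    "bank.n.02": "a financial institution that accepts deposits and lends money",
    "money.n.01": "currency issued by a financial institution",
}

# Assumed minimal stopword list.
STOPWORDS: Set[str] = {"a", "of", "the", "that", "and", "beside", "by"}


def gloss_tokens(arabic_synset: str) -> Set[str]:
    """Content words of the English gloss mapped to an Arabic synset."""
    gloss = PWN_GLOSSES[ARABIC_TO_PWN[arabic_synset]]
    return {w for w in gloss.lower().split() if w not in STOPWORDS}


def lesk_overlap(synset_a: str, synset_b: str) -> int:
    """Simple Lesk score: number of content words shared by the two glosses."""
    return len(gloss_tokens(synset_a) & gloss_tokens(synset_b))


def best_sense(candidates: List[str], context_senses: List[str]) -> str:
    """Pick the candidate whose gloss overlaps most with the context glosses."""
    return max(
        candidates,
        key=lambda c: sum(lesk_overlap(c, ctx) for ctx in context_senses),
    )
```

With this toy data, the financial sense of the ambiguous word is preferred when a money-related sense appears in the context, because their mapped glosses share the words "financial" and "institution". A global method such as the ant colony algorithm would instead search over sense combinations for the whole text, using a measure of this kind as its local score.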
