Wajeez: An Extractive Automatic Arabic Text Summarisation System

The volume of Arabic information is increasing rapidly, and access to the correct information is therefore arguably one of the most difficult problems facing readers and researchers. Text summarisation systems produce a short text that describes the significant portions of the original text. Extractive systems do so by selecting the most important sentences in several steps: preprocessing, stemming, scoring, and summary extraction. Nevertheless, summarisation systems for the Arabic language are still in their infancy. This paper therefore proposes an automatic Arabic text summarisation system, entitled Wajeez, which introduces a new inclusive scoring formula and generates the final summary from the top-ranking sentences. Wajeez was evaluated on two datasets, the Essex Arabic Summaries Corpus (EASC) and a manual summary, using the Recall-Oriented Understudy for Gisting Evaluation (ROUGE) set of metrics. Compared with two other competing systems, Wajeez performed comparatively well when a title was provided to support summarisation.
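The four steps named in the abstract (preprocessing, stemming, scoring, and summary extraction) can be illustrated with a minimal sketch. This is not the Wajeez scoring formula, which the abstract does not specify. The stopword list, the `light_stem` helper, the `title_weight` parameter, and the score itself (average term frequency plus a bonus for title overlap) are all assumptions chosen for illustration only.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (assumption; a real system would use a fuller one).
STOPWORDS = {"في", "من", "على", "إلى", "عن", "و", "the", "a", "of", "is", "in", "and"}

def split_sentences(text):
    # Preprocessing: split on Latin and Arabic sentence-ending punctuation.
    parts = re.split(r"[.!?؟]+", text)
    return [p.strip() for p in parts if p.strip()]

def light_stem(token):
    # Crude stand-in for stemming (assumption): strip the Arabic definite article.
    if token.startswith("ال") and len(token) > 3:
        return token[2:]
    return token

def preprocess(sentence):
    # Lowercase, tokenise, drop stopwords, then stem each remaining token.
    tokens = re.findall(r"\w+", sentence.lower())
    return [light_stem(t) for t in tokens if t not in STOPWORDS]

def summarise(text, title="", k=2, title_weight=2.0):
    sentences = split_sentences(text)
    tokenised = [preprocess(s) for s in sentences]
    # Document-wide term frequencies over stemmed tokens.
    freq = Counter(t for toks in tokenised for t in toks)
    title_terms = set(preprocess(title))
    scores = []
    for toks in tokenised:
        if not toks:
            scores.append(0.0)
            continue
        # Hypothetical score: mean term frequency plus a weighted title-overlap count.
        tf = sum(freq[t] for t in toks) / len(toks)
        overlap = len(title_terms & set(toks))
        scores.append(tf + title_weight * overlap)
    # Extraction: keep the k best sentences, restored to document order.
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

For example, `summarise("Cats sleep a lot. Dogs bark at cats. Birds fly south.", title="Dogs", k=1)` selects the sentence that shares a term with the title, which mirrors the abstract's observation that a supplied title can help summarisation.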
