Automatic Arabic Summarization: A survey of methodologies and systems

Abstract: Text summarization has been a field of intensive research over the last 50 years, especially for widely used languages with relatively simple grammar, such as English. Meanwhile, the unprecedented growth in the amount of online information available in many languages, including news articles and social media, has made it difficult and time-consuming for users and businesses to identify and consume sought-after content. Hence, automatic text summarization that generates accurate and relevant summaries from this huge amount of information, across various languages, is now essential. Techniques and methodologies for Arabic text summarization are still immature due to the inherent complexity of the Arabic language in terms of both structure and morphology. This paper describes the main challenges for Arabic text summarization and surveys the various methodologies and systems in the literature. The survey can serve as a basis for designing an Arabic automatic text summarization system that combines the “good” features of existing systems and discards the “not-so-good” ones.
