Automatic Arabic text summarization: a survey

This survey reviews research studies conducted in the field of Arabic text summarization. Specifically, it addresses the summarization and evaluation methods, as well as the corpora, used in those studies. The literature in this field is fairly limited and relatively new compared with the literature available for other languages, such as English, so there is considerable opportunity for further research in Arabic text summarization. One of the largest problems in Arabic summarization has been the absence of Arabic gold-standard summaries, although this situation is beginning to change, especially with the inclusion of Arabic in the corpora and tasks of the TAC 2011 MultiLing Pilot and the ACL 2013 MultiLing Workshop. Finally, providing the required corpora and adopting them in Arabic summarization studies remains an essential need.
