Literature Review of Automatic Text Summarization: Research Trend, Dataset and Method

Automatic text summarization can be defined as the process of condensing one or more text documents by machine while preserving their main information content, in a summary that is no longer than half of the original text, or at least shorter than the original. Research on text summarization began in the 1950s, and to date no system can produce summaries that match those written by professionals or other human writers. This paper aims to identify and analyze the methods, datasets and trends in automatic text summarization research from 2015 to 2019. The method used is a systematic literature review (SLR) of automatic text summarization. The results show that research on automatic text summarization remains relevant. The extractive approach has remained in demand over the past three years because it is easier than the abstractive approach and because the opportunity to combine methods is still open. One example is the use of neurocomputing approaches, notably the emergence of a new DQN (Deep Q-Network) method that shows comparable or even better results. The research trend has also shifted slightly over the past three years toward optimization, that is, how to optimize text summarization performance in order to achieve high accuracy.
