Abstractive Text Summarization based on Improved Semantic Graph Approach

The goal of abstractive multi-document summarization is to automatically produce a condensed version of the source documents that preserves their significant information. Most graph-based extractive methods represent each sentence as a bag of words and rely on a content similarity measure, which may fail to detect semantically equivalent, redundant sentences. On the other hand, graph-based abstractive methods depend on a domain expert to build a semantic graph from a manually created ontology, which requires time and effort. This work presents a semantic graph approach with an improved ranking algorithm for abstractive multi-document summarization. The semantic graph is built from the source documents so that the graph nodes denote predicate argument structures (PASs), the semantic structures of sentences, which are identified automatically using semantic role labeling, while the graph edges carry similarity weights computed from the semantic similarity between PASs. To reflect the impact of both the document and the document set on PASs, the edges of the semantic graph are further augmented with PAS-to-document and PAS-to-document-set relationships. The important graph nodes (PASs) are ranked using the improved graph ranking algorithm. Redundant PASs are reduced by re-ranking with maximal marginal relevance, and the summary sentences are finally generated from the top-ranked PASs using language generation. The experiments are conducted on DUC-2002, a standard dataset for document summarization. The experimental findings indicate that the proposed approach outperforms other summarization approaches.
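The rank-then-re-rank pipeline described in the abstract can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the paper: each PAS is reduced to a hypothetical set of terms, Jaccard overlap stands in for the PAS semantic similarity, a plain weighted PageRank-style iteration stands in for the improved ranking algorithm (the PAS-to-document and PAS-to-document-set augmentation is omitted), and the names `jaccard`, `rank_nodes`, `mmr_select`, and the parameters `damping` and `lam` are illustrative only.

```python
from typing import List, Set


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Stand-in similarity (assumption): term overlap between two PASs."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def rank_nodes(sim: List[List[float]], damping: float = 0.85,
               iters: int = 50) -> List[float]:
    """Weighted PageRank-style scores over a similarity matrix.

    This is a generic stand-in, not the paper's improved ranking algorithm.
    """
    n = len(sim)
    out_weight = [sum(row) for row in sim]  # total edge weight per node
    scores = [1.0 / n] * n
    for _ in range(iters):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on its score in proportion to edge weight.
            incoming = sum(
                sim[j][i] / out_weight[j] * scores[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            new_scores.append((1 - damping) / n + damping * incoming)
        scores = new_scores
    return scores


def mmr_select(scores: List[float], sim: List[List[float]],
               k: int, lam: float = 0.7) -> List[int]:
    """Greedy maximal marginal relevance: trade rank score against redundancy."""
    selected: List[int] = []
    candidates = set(range(len(scores)))
    while candidates and len(selected) < k:
        def mmr(i: int) -> float:
            # Redundancy is the highest similarity to anything already chosen.
            redundancy = max((sim[i][j] for j in selected), default=0.0)
            return lam * scores[i] - (1 - lam) * redundancy
        best = max(sorted(candidates), key=mmr)
        selected.append(best)
        candidates.remove(best)
    return selected


# Hypothetical PASs, each reduced to a set of terms (illustrative data only).
pas_terms = [
    {"storm", "hit", "coast"},
    {"storm", "strike", "coast"},
    {"official", "order", "evacuation"},
    {"storm", "force", "evacuation"},
]

n = len(pas_terms)
# Edge weights between distinct PASs; no self-loops.
sim = [[0.0 if i == j else jaccard(pas_terms[i], pas_terms[j])
        for j in range(n)] for i in range(n)]

scores = rank_nodes(sim)
selected = mmr_select(scores, sim, k=2)
print(scores, selected)
```

In this toy data the first two PASs are near-duplicates, so the redundancy penalty in `mmr_select` keeps only one of them even though both receive a high graph score. Sentence generation from the selected PASs is outside the scope of the sketch.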
