Abstractive Summarization: A Hybrid Approach for the Compression of Semantic Graphs

Customizing information from web documents is a large task that mainly involves shortening the original texts. This task is carried out using summarization techniques. In general, automatically generated summaries are of two types: extractive and abstractive. Extractive methods use surface-level and statistical features to select important sentences, without considering the meaning conveyed by those sentences. In contrast, abstractive methods need a formal semantic representation, in which the selection of important components and the rephrasing of the selected components are carried out using the semantic features associated with the words as well as the context. Furthermore, a deep linguistic analysis is needed for generating summaries. The bottleneck of abstractive summarization, however, is that it requires semantic representation, inference rules and natural language generation. In this paper, the authors propose a semi-supervised bootstrapping approach for the identification of important components for abstractive summarization. The input to the proposed approach is a fully connected semantic graph of a document: semantic graphs are first constructed for individual sentences and are then connected through synonym concepts and co-referring entities to form a complete semantic graph. The direction of the traversal of nodes is determined by a modified spreading activation algorithm, in which the importance of nodes and edges is decided based on the node under consideration and its connected edges. The summary obtained using the proposed approach is compared with extractive and template-based summaries, and is also evaluated using ROUGE scores.
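The traversal idea can be illustrated with a minimal sketch of plain spreading activation over a small weighted graph. This is not the authors' modified algorithm, whose details the abstract does not give. The function name `spread_activation`, the parameters `decay`, `threshold` and `max_steps`, and the toy graph are all assumptions made for illustration only.

```python
from collections import defaultdict


def spread_activation(edges, seeds, decay=0.5, threshold=0.1, max_steps=3):
    """Generic spreading activation (illustrative, not the paper's variant).

    edges: dict mapping a node to a list of (neighbor, edge_weight) pairs.
    seeds: dict mapping seed nodes to their initial activation.
    """
    activation = defaultdict(float)
    activation.update(seeds)
    frontier = dict(seeds)
    for _ in range(max_steps):
        next_frontier = defaultdict(float)
        for node, energy in frontier.items():
            # Nodes whose incoming energy is below the threshold do not fire.
            if energy < threshold:
                continue
            for neighbor, weight in edges.get(node, []):
                # Energy passed on is attenuated by edge weight and decay.
                next_frontier[neighbor] += energy * weight * decay
        if not next_frontier:
            break
        for node, energy in next_frontier.items():
            activation[node] += energy
        frontier = next_frontier
    return dict(activation)


# Hypothetical toy graph: concept nodes linked by weighted semantic relations.
edges = {
    "storm": [("flood", 1.0), ("city", 0.5)],
    "flood": [("damage", 1.0)],
    "city": [],
}
scores = spread_activation(edges, seeds={"storm": 1.0})
ranked = sorted(scores, key=scores.get, reverse=True)
```

In this sketch, nodes that end up with higher activation would be treated as more important candidates for the summary. How the proposed approach actually weights nodes and edges is described only at a high level in the abstract.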
