Automatic Multi Document Summarization Approaches

Problem statement: Text summaries range from indicative summaries, which identify the topics of a document, to informative summaries, which give a concise description of the original document and convey what its content is about. Approach: Single document summarization appears to serve both purposes well, but multi document summarization does not: informative summaries built from multiple documents often lack overall comprehensiveness. Most existing methods focus on sentence scoring and give less consideration to the contextual information contained in multiple documents. Results: This study surveys approaches to multi document summarization. It focuses on four well known approaches: the feature based, cluster based, graph based and knowledge based methods. The general idea behind each method is described. Conclusion: Beyond the general ideas and concepts, we discuss the benefits and limitations of these methods. To enhance multi document summarization, specifically of news documents, we outline a novel approach to be developed in future work that takes into account the generic components of a news story in order to generate a better summary.
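
To make the sentence scoring idea mentioned in the abstract concrete, the following is a minimal sketch of a feature based extractive summarizer over several documents. It is an illustration only, not the method of the surveyed papers: the two features (average term frequency and sentence position), their equal weighting, the stopword list, the Jaccard redundancy threshold of 0.5 and all function names are assumptions chosen for brevity.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption; real systems use larger lists).
STOPWORDS = {"the", "a", "an", "of", "in", "on", "and", "to", "is", "was", "for", "by"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z0-9]+", sentence.lower()) if w not in STOPWORDS]


def jaccard(a, b):
    """Word-overlap similarity between two sentences, used to skip near-duplicates."""
    sa, sb = set(tokenize(a)), set(tokenize(b))
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


def summarize(documents, max_sentences=2, redundancy_threshold=0.5):
    """Score every sentence by two hand-picked features and keep the best ones."""
    # Pool sentences from all documents, remembering each one's position
    # within its own document.
    candidates = [
        (sent, pos)
        for doc in documents
        for pos, sent in enumerate(split_sentences(doc))
    ]
    # Term frequencies are counted over the whole document set.
    freq = Counter(w for sent, _ in candidates for w in tokenize(sent))

    def score(sent, pos):
        words = tokenize(sent)
        if not words:
            return 0.0
        term_feature = sum(freq[w] for w in words) / len(words)
        position_feature = 1.0 / (1 + pos)  # earlier sentences score higher
        return term_feature + position_feature

    ranked = sorted(candidates, key=lambda c: score(*c), reverse=True)

    summary = []
    for sent, _ in ranked:
        # Multiple documents often repeat the same fact; skip near-duplicates.
        if all(jaccard(sent, chosen) < redundancy_threshold for chosen in summary):
            summary.append(sent)
        if len(summary) == max_sentences:
            break
    return summary


docs = [
    "The flood hit the city on Monday. Officials opened shelters.",
    "The flood hit the city on Monday. Rescue teams arrived by boat.",
]
print(summarize(docs, max_sentences=2))
```

The redundancy check reflects a concern specific to the multi document setting: the same sentence can appear in several sources, so ranking by score alone would repeat it. The sketch scores each sentence in isolation and ignores the contextual information across documents that the abstract identifies as underused.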
