A System for Query Specific Coherent Text Multi-Document Summarization

In this paper, we present QueSTS, a system that generates a query-specific extractive summary of a selected set of documents. We propose an integrated graph that represents the contextual relationships among the sentences of all input documents. These relationships are exploited to construct several sub-graphs of the integrated graph, each consisting of sentences that are highly relevant to the query and highly related to one another. A scoring model ranks the sub-graphs, and the highest-ranked sub-graph, which is the richest in query-relevant information, is selected as the query-specific summary. We also propose a sentence ordering strategy to improve the coherence of the summary, and the sentences of the selected sub-graph are sequenced according to it. Experimental results show that the summaries generated by QueSTS are significantly better than those of other systems in terms of user satisfaction.
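The pipeline described above can be illustrated with a minimal sketch. It is not the QueSTS implementation: the abstract does not specify the edge weights, the scoring model, or the ordering strategy, so the sketch assumes bag-of-words cosine similarity for both sentence-to-sentence edges and query relevance, a greedy expansion from each query-relevant seed sentence, a sub-graph score equal to the sum of node relevances, and an ordering by document index and sentence position. The names `bag`, `cosine`, `summarize`, `size`, and `min_sim` are hypothetical.

```python
import math
import re
from collections import Counter


def bag(text):
    """Lowercased bag-of-words vector for a sentence or query."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two Counter vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def summarize(docs, query, size=2, min_sim=0.1):
    """docs: list of documents, each a list of sentences. Returns ordered summary sentences."""
    # Nodes of the integrated graph: (document index, position in document, sentence text).
    nodes = [(d, p, s) for d, doc in enumerate(docs) for p, s in enumerate(doc)]
    vecs = [bag(s) for _, _, s in nodes]
    q = bag(query)
    rel = [cosine(v, q) for v in vecs]  # query relevance per node (assumed measure)

    def sim(i, j):
        # Edge weight between two sentences (assumed measure).
        return cosine(vecs[i], vecs[j])

    best, best_score = [], -1.0
    for seed in range(len(nodes)):
        if rel[seed] == 0.0:
            continue  # only grow sub-graphs from query-relevant sentences
        sub = [seed]
        while len(sub) < size:
            # Candidates: nodes linked to the sub-graph by an edge of weight >= min_sim.
            cands = [(rel[j] + max(sim(i, j) for i in sub), j)
                     for j in range(len(nodes))
                     if j not in sub and max(sim(i, j) for i in sub) >= min_sim]
            if not cands:
                break
            sub.append(max(cands)[1])
        score = sum(rel[i] for i in sub)  # assumed scoring model
        if score > best_score:
            best, best_score = sub, score
    # Assumed ordering strategy: document order, then position within the document.
    return [nodes[i][2] for i in sorted(best, key=lambda i: (nodes[i][0], nodes[i][1]))]
```

Calling `summarize(docs, "solar panels electricity")` on a small collection returns the sentences of the best-scoring sub-graph in document order. If no sentence shares a term with the query, the result is an empty list.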
