Applying two-level reinforcement ranking in query-oriented multidocument summarization

Sentence ranking is the issue of most concern in document summarization today. Traditional feature-based approaches evaluate sentence significance and rank sentences using features designed to characterize different aspects of the individual sentences. In contrast, the newly emerging graph-based ranking algorithms (such as the PageRank-like algorithms) recursively compute sentence significance using the global information in a text graph that links sentences together. In general, the existing PageRank-like algorithms model well the phenomenon that a sentence is important if it is linked to by many other important sentences. In other words, they are capable of modeling the mutual reinforcement among the sentences in the text graph. However, when dealing with multidocument summarization, these algorithms often merge a set of documents into one large file, so the document dimension is ignored entirely. In this article we present a framework to model the two-level mutual reinforcement among sentences as well as documents. Under this framework we design and develop a novel ranking algorithm in which document reinforcement is taken into account in the process of sentence ranking. The convergence issue is examined. We also explore an interesting and important property of the proposed algorithm. When evaluated on the DUC 2005 and 2006 query-oriented multidocument summarization datasets, significant results are achieved.
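
The idea of two-level mutual reinforcement can be illustrated with a small iterative sketch. This is not the algorithm proposed in the article, whose exact update rules, query weighting, and convergence analysis are not given in the abstract. It is a minimal PageRank-style illustration under assumed choices: the function name `two_level_rank`, the mixing weights `alpha` and `beta`, the separate normalization of the two score vectors, and the toy affinity matrices are all hypothetical.

```python
def normalize(vec):
    """Scale a non-negative vector so its entries sum to 1."""
    total = sum(vec)
    return [v / total for v in vec] if total > 0 else list(vec)


def two_level_rank(ss, dd, sd, alpha=0.7, beta=0.3, max_iters=100, tol=1e-9):
    """Jointly score sentences and documents (illustrative assumption only).

    ss[j][i]: affinity from sentence j to sentence i (sentence level)
    dd[l][k]: affinity from document l to document k (document level)
    sd[i][k]: affinity between sentence i and document k (cross level)
    alpha weights same-level reinforcement, beta weights cross-level.
    """
    n, m = len(ss), len(dd)
    s = [1.0 / n] * n  # sentence scores, uniform start
    d = [1.0 / m] * m  # document scores, uniform start
    for _ in range(max_iters):
        # A sentence gains score from linked sentences and from its documents.
        new_s = [
            alpha * sum(ss[j][i] * s[j] for j in range(n))
            + beta * sum(sd[i][k] * d[k] for k in range(m))
            for i in range(n)
        ]
        # A document gains score from linked documents and from its sentences.
        new_d = [
            alpha * sum(dd[l][k] * d[l] for l in range(m))
            + beta * sum(sd[i][k] * s[i] for i in range(n))
            for k in range(m)
        ]
        new_s, new_d = normalize(new_s), normalize(new_d)
        change = max(
            max(abs(a - b) for a, b in zip(new_s, s)),
            max(abs(a - b) for a, b in zip(new_d, d)),
        )
        s, d = new_s, new_d
        if change < tol:  # stop early once the scores stop moving
            break
    return s, d


# Toy data (made up): sentences 0 and 1 belong to document 0,
# sentence 2 belongs to document 1.
ss = [[0.0, 0.8, 0.1],
      [0.8, 0.0, 0.1],
      [0.1, 0.1, 0.0]]
dd = [[0.0, 0.5],
      [0.5, 0.0]]
sd = [[1.0, 0.0],
      [1.0, 0.0],
      [0.0, 1.0]]

sentence_scores, doc_scores = two_level_rank(ss, dd, sd)
print("sentence scores:", sentence_scores)
print("document scores:", doc_scores)
```

In this toy setting the two strongly linked sentences reinforce each other and also raise the score of the document that contains them, which in turn feeds back into their own scores. Merging all documents into one file would discard exactly this second level of reinforcement.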
