Paper information - Applying two-level reinforcement ranking in query-oriented multidocument summarization

Applying two-level reinforcement ranking in query-oriented multidocument summarization

Sentence ranking is the issue of most concern in document summarization today. Traditional feature-based approaches evaluate sentence significance and rank sentences using features designed to characterize different aspects of the individual sentences. The newly emerging graph-based ranking algorithms (such as the PageRank-like algorithms) instead compute sentence significance recursively, using the global information in a text graph that links sentences together. In general, the existing PageRank-like algorithms model well the phenomenon that a sentence is important if it is linked to by many other important sentences; in other words, they capture the mutual reinforcement among the sentences in the text graph. However, when dealing with multidocument summarization, these algorithms often assemble a set of documents into one large file, so the document dimension is totally ignored. In this article we present a framework to model the two-level mutual reinforcement among sentences as well as documents. Under this framework we design and develop a novel ranking algorithm that takes document reinforcement into account in the process of sentence ranking. The convergence issue is examined. We also explore an interesting and important property of the proposed algorithm. When evaluated on the DUC 2005 and 2006 query-oriented multidocument summarization datasets, the algorithm achieves significant results.
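As a rough illustration of the PageRank-like sentence ranking that the abstract takes as its starting point, the following is a minimal sketch of power iteration over a sentence similarity graph. It is not the two-level algorithm proposed in the article: the function name `rank_sentences`, the damping value of 0.85, and the toy similarity matrix are all assumptions made for this example, and the document dimension is deliberately left out.

```python
def rank_sentences(similarity, damping=0.85, tol=1e-8, max_iter=200):
    """Score sentences by PageRank-style power iteration (illustrative only).

    `similarity` is a square list of lists; similarity[i][j] is a
    non-negative edge weight between sentence i and sentence j.
    """
    n = len(similarity)

    # Row-normalise the edge weights, ignoring self-links.
    transition = []
    for i, row in enumerate(similarity):
        weights = [w if j != i else 0.0 for j, w in enumerate(row)]
        total = sum(weights)
        if total > 0:
            transition.append([w / total for w in weights])
        else:
            # An isolated sentence spreads its score uniformly.
            transition.append([1.0 / n] * n)

    # Start from a uniform distribution and iterate until the scores settle.
    scores = [1.0 / n] * n
    for _ in range(max_iter):
        updated = [
            (1 - damping) / n
            + damping * sum(transition[j][i] * scores[j] for j in range(n))
            for i in range(n)
        ]
        delta = sum(abs(u - s) for u, s in zip(updated, scores))
        scores = updated
        if delta < tol:
            break
    return scores


# Toy graph (hypothetical): sentence 0 is linked to both other sentences.
toy_similarity = [
    [0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
]
toy_scores = rank_sentences(toy_similarity)
```

In this toy graph, sentence 0 receives the highest score because both other sentences link to it, which is the single-level mutual reinforcement the abstract describes. A two-level variant would additionally let document scores and sentence scores reinforce each other.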

Furu Wei | Yanxiang He | Qin Lu | Wenjie Li
