Document-Aware Graph Models for Query-Oriented Multi-document Summarization

Sentence ranking is the central issue in document summarization. In recent years, graph-based summarization models and sentence ranking algorithms have drawn considerable attention from the extractive summarization community because they calculate sentence significance recursively from the entire text graph that links sentences together, rather than relying on a single sentence alone. However, when dealing with multi-document summarization, existing sentence ranking algorithms often assemble a set of documents into one large file, so the document dimension is ignored. In this work, we develop two alternative models that integrate the document dimension into existing sentence ranking algorithms: the one-layer (i.e., sentence layer) document-sensitive model and the two-layer (i.e., document and sentence layers) mutual reinforcement model. While the former implicitly incorporates the document's influence in sentence ranking, the latter explicitly formulates the mutual reinforcement between sentences and documents during ranking. The effectiveness of the proposed models and algorithms is examined on the DUC query-oriented multi-document summarization data sets.
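
To make the one-layer idea concrete, the following is a minimal sketch of a PageRank-style sentence ranker in which the source document of each sentence affects the edge weights. It is not the formulation used in this work, which the abstract does not specify. The function name `rank_sentences`, the `cross_doc_weight` factor (which scales edges between sentences from different documents), and the toy similarity values are all assumptions made for illustration, and query relevance is left out for brevity.

```python
def rank_sentences(sim, doc_ids, cross_doc_weight=1.5, damping=0.85,
                   iters=100, tol=1e-9):
    """Rank sentences on a similarity graph with a PageRank-style iteration.

    sim              -- n x n list of pairwise sentence similarities
    doc_ids          -- doc_ids[i] is the document that sentence i comes from
    cross_doc_weight -- hypothetical factor applied to edges that link
                        sentences from different documents (an assumption,
                        not taken from the paper)
    """
    n = len(doc_ids)

    # Build document-sensitive edge weights; self-links are dropped.
    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            factor = 1.0 if doc_ids[i] == doc_ids[j] else cross_doc_weight
            weights[i][j] = factor * sim[i][j]

    # Row-normalize into transition probabilities; isolated sentences
    # spread their score uniformly.
    transition = []
    for row in weights:
        total = sum(row)
        if total > 0:
            transition.append([w / total for w in row])
        else:
            transition.append([1.0 / n] * n)

    # Power iteration with damping, starting from a uniform distribution.
    scores = [1.0 / n] * n
    for _ in range(iters):
        new_scores = [
            (1 - damping) / n
            + damping * sum(transition[i][j] * scores[i] for i in range(n))
            for j in range(n)
        ]
        delta = sum(abs(a - b) for a, b in zip(new_scores, scores))
        scores = new_scores
        if delta < tol:
            break
    return scores


# Toy input: four sentences, the first two from document 0, the rest from document 1.
sim = [
    [1.0, 0.6, 0.3, 0.0],
    [0.6, 1.0, 0.4, 0.1],
    [0.3, 0.4, 1.0, 0.5],
    [0.0, 0.1, 0.5, 1.0],
]
doc_ids = [0, 0, 1, 1]
scores = rank_sentences(sim, doc_ids)
ranking = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
```

Setting `cross_doc_weight` to 1.0 reduces the sketch to the flat case described above, in which all documents are treated as one large file.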
