SRRank: Leveraging Semantic Roles for Extractive Multi-Document Summarization

Extractive multi-document summarization systems usually rank the sentences in a document set with some ranking strategy and then select a few highly ranked sentences for the summary. Graph-based ranking is one of the most popular such strategies. In this paper, we investigate the use of semantic role information to enhance graph-based ranking for multi-document summarization. We first parse the sentences to obtain their semantic roles, and then propose a novel algorithm, SRRank, together with two extensions that make better use of the semantic role information. The proposed algorithms rank sentences, semantic roles, and words simultaneously in a single heterogeneous ranking process. Experimental results on two DUC datasets show that the proposed algorithms significantly outperform several baselines and confirm that semantic role information is very helpful for multi-document summarization.
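
For readers unfamiliar with graph-based ranking, the following is a minimal sketch of the generic baseline idea only (a PageRank-style walk over a sentence similarity graph). It is not the SRRank algorithm, which additionally ranks semantic roles and words. The function names, whitespace tokenization, cosine similarity over raw term counts, damping factor of 0.85, and fixed iteration count are all illustrative assumptions rather than details from the paper.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_sentences(sentences, damping=0.85, iterations=50):
    """Score sentences with a PageRank-style iteration (assumed setup)."""
    n = len(sentences)
    if n == 0:
        return []
    # Assumption: naive lowercase whitespace tokenization.
    bags = [Counter(s.lower().split()) for s in sentences]
    # Edge weights are pairwise similarities; no self-loops.
    weights = [
        [cosine(bags[i], bags[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_weight = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        updated = []
        for i in range(n):
            # Collect score from neighbours, normalized by their total edge weight.
            incoming = sum(
                scores[j] * weights[j][i] / out_weight[j]
                for j in range(n)
                if out_weight[j] > 0
            )
            updated.append((1 - damping) / n + damping * incoming)
        scores = updated
    return scores


sentences = [
    "the cat sat on the mat",
    "the cat lay on the mat",
    "stocks fell sharply today",
]
scores = rank_sentences(sentences)
# Pick the highest-scoring sentence as a one-sentence "summary".
best = max(range(len(sentences)), key=lambda i: scores[i])
```

In this toy input the two overlapping sentences reinforce each other, while the unrelated one receives only the baseline score. A real system would also remove redundant sentences before assembling the summary.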
