An Approach for Combining Multiple Weighting Schemes and Ranking Methods in Graph-Based Multi-Document Summarization

Automatic text summarization aims to reduce the size of a document by producing a brief and valuable summary that captures its most important ideas. Over the years, many approaches have been proposed to improve automatic text summarization results; graph-based sentence ranking is considered one of the most important approaches in this field. However, most of these approaches rely on only one weighting scheme and one ranking method, which may limit their performance. In this paper, we focus on combining multiple graph-based approaches to improve the results of generic, extractive, multi-document summarization. More accurate summaries can in turn serve as a significant component of other natural language applications. We develop and experiment with two graph-based approaches that combine four weighting schemes and two ranking methods in one graph framework. To combine these methods, we average their results using either the arithmetic mean or the harmonic mean. We evaluate the proposed approaches on the DUC 2003 and DUC 2004 datasets and measure performance with the ROUGE evaluation toolkit. Our experiments demonstrate that combining weighting schemes with the harmonic mean outperforms combining them with the arithmetic mean and improves over the baselines and many state-of-the-art systems.
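The difference between the two averaging strategies can be illustrated with a minimal sketch. It assumes each weighting scheme yields one positive score per sentence; the abstract does not say whether averaging is applied to edge weights or to final ranking scores, so this choice, the function name `combine_scores`, and the score values are all hypothetical and are not taken from the paper.

```python
from statistics import fmean, harmonic_mean


def combine_scores(score_lists, method="harmonic"):
    """Combine per-sentence scores from several schemes into one score each.

    score_lists: list of equal-length lists, one list per scheme (assumed layout).
    method: "harmonic" or "arithmetic".
    """
    combiner = harmonic_mean if method == "harmonic" else fmean
    # zip(*score_lists) groups the scores every scheme gave to the same sentence.
    return [combiner(scores) for scores in zip(*score_lists)]


# Hypothetical scores for three sentences under two schemes.
scheme_a = [0.2, 0.5, 0.8]
scheme_b = [0.8, 0.5, 0.2]

arithmetic = combine_scores([scheme_a, scheme_b], method="arithmetic")
harmonic = combine_scores([scheme_a, scheme_b], method="harmonic")

# Rank sentence indices from highest to lowest combined score.
ranking = sorted(range(len(harmonic)), key=lambda i: harmonic[i], reverse=True)
```

In this toy input the arithmetic mean gives all three sentences the same score, whereas the harmonic mean ranks the second sentence first because both schemes agree on it. The harmonic mean penalizes a sentence that any single scheme scores low, which is one plausible reading of why the two combinations behave differently.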
