Unity in Diversity: Learning Distributed Heterogeneous Sentence Representation for Extractive Summarization

Automated multi-document extractive text summarization is a widely studied problem in natural language understanding. Extractive systems compute, in some form, the worthiness of each sentence for inclusion in the summary. While conventional approaches rely on hand-crafted, document-independent features to generate a summary, we develop HNet, a novel data-driven summarization system that exploits the semantic and compositional aspects latent in a sentence to capture document-independent features. The network learns sentence representations such that salient sentences lie closer in the vector space than non-salient ones. This semantic and compositional feature vector is then concatenated with document-dependent features for sentence ranking. Experiments on the DUC benchmark datasets (DUC-2001, DUC-2002, and DUC-2004) show that our model achieves a significant gain of around 1.5-2 ROUGE points over state-of-the-art baselines.
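To make the two core ideas in the abstract concrete, here is a minimal sketch of (1) learning sentence embeddings so that salient sentences lie closer in vector space than non-salient ones, via a triplet-style margin loss, and (2) concatenating the learned embedding with document-dependent features before ranking. This is an illustrative PyTorch sketch, not the paper's exact architecture: the BiLSTM encoder, the specific document-dependent features, the dimensions, and all names are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SentenceScorer(nn.Module):
    """Illustrative sketch: encode a sentence, then rank it using the
    embedding concatenated with document-dependent features."""

    def __init__(self, vocab_size, emb_dim=100, hid_dim=128, n_doc_feats=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Hypothetical semantic/compositional encoder; the paper combines
        # several views of the sentence, a single BiLSTM stands in here.
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True,
                               bidirectional=True)
        # Score computed from [sentence embedding ; document-dependent features].
        self.rank = nn.Linear(2 * hid_dim + n_doc_feats, 1)

    def encode(self, token_ids):
        # token_ids: (batch, seq_len) -> embedding of shape (batch, 2*hid_dim)
        out, _ = self.encoder(self.embed(token_ids))
        return out.mean(dim=1)  # mean-pool over time steps

    def forward(self, token_ids, doc_feats):
        # doc_feats: (batch, n_doc_feats), e.g. sentence position, length,
        # TF-IDF overlap with the document (illustrative features only).
        h = self.encode(token_ids)
        return self.rank(torch.cat([h, doc_feats], dim=-1)).squeeze(-1)

def saliency_margin_loss(anchor, positive, negative, margin=1.0):
    """Pull embeddings of salient sentences together and push non-salient
    ones at least `margin` further away (triplet-style objective)."""
    d_pos = F.pairwise_distance(anchor, positive)
    d_neg = F.pairwise_distance(anchor, negative)
    return F.relu(d_pos - d_neg + margin).mean()
```

In a training loop under these assumptions, the margin loss shapes the embedding space from (salient, salient, non-salient) sentence triplets, while the ranking head is trained on saliency labels; at inference, sentences are sorted by the ranking score and the top ones are extracted into the summary.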
