Paper Information - Summary Cloze: A New Task for Content Selection in Topic-Focused Summarization

A key challenge in topic-focused summarization is determining what information should be included in the summary, a problem known as content selection. In this work, we propose a new method for studying content selection in topic-focused summarization called the summary cloze task. The goal of the summary cloze task is to generate the next sentence of a summary conditioned on the beginning of the summary, a topic, and one or more reference documents. The main challenge is deciding what information in the references is relevant to the topic and partial summary and should be included in the summary. Although the cloze task does not address all aspects of the traditional summarization problem, the narrower scope of the task allows us to collect a large-scale dataset of nearly 500k summary cloze instances from Wikipedia. We report experimental results on this new dataset using various extractive models and a two-step abstractive model that first extractively selects a small number of sentences and then abstractively summarizes them. Our results show that the topic and partial summary help the models identify relevant content, but the task remains a significant challenge.
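To make the task definition concrete, the following is a minimal sketch of what one summary cloze instance might look like, together with a naive extractive selector. It is not one of the models evaluated in the paper: the `ClozeInstance` class, its field names, the word-overlap scoring, and the toy example text are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Set


@dataclass
class ClozeInstance:
    # Hypothetical field names; the released dataset may use a different schema.
    topic: str              # topic the summary is focused on
    context: List[str]      # partial summary: the sentences written so far
    references: List[str]   # sentences from the reference document(s)
    cloze: str              # target: the next sentence of the summary


def tokens(text: str) -> Set[str]:
    """Lowercase a string and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def select_sentence(instance: ClozeInstance) -> str:
    """Pick the reference sentence that overlaps most with the topic and context.

    This is a simple lexical-overlap heuristic standing in for a real
    content selection model, not the method used in the paper.
    """
    query = tokens(instance.topic + " " + " ".join(instance.context))

    def score(sentence: str) -> float:
        sent = tokens(sentence)
        # Fraction of the sentence's distinct words that also occur in the query.
        return len(sent & query) / len(sent) if sent else 0.0

    return max(instance.references, key=score)


# Toy instance, invented for this example.
instance = ClozeInstance(
    topic="Early life",
    context=["Ada Lovelace was born in London in 1815."],
    references=[
        "The city has a large number of parks.",
        "Lovelace was educated in mathematics as a child in London.",
        "Her later work concerned the Analytical Engine.",
    ],
    cloze="She was tutored in mathematics from an early age.",
)

selected = select_sentence(instance)
```

Here the selector conditions on both the topic and the partial summary, mirroring the inputs named in the task definition; an abstractive second step, as in the two-step model the abstract mentions, would then rewrite the selected sentences rather than copy them.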

Dan Roth | Daniel Deutsch
