A Unified Model for Extractive and Abstractive Summarization using Inconsistency Loss

We propose a unified model combining the strengths of extractive and abstractive summarization. On the one hand, a simple extractive model can obtain sentence-level attention with high ROUGE scores but produces less readable output. On the other hand, a more complicated abstractive model can obtain word-level dynamic attention to generate more readable paragraphs. In our model, sentence-level attention is used to modulate the word-level attention such that words in less attended sentences are less likely to be generated. Moreover, a novel inconsistency loss function is introduced to penalize the inconsistency between the two levels of attention. By training our model end-to-end with the inconsistency loss together with the original extractive and abstractive losses, we achieve state-of-the-art ROUGE scores, and our summaries are rated the most informative and readable on the CNN/Daily Mail dataset in a solid human evaluation.
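As a rough illustration of the two mechanisms described above, the sketch below (hypothetical variable names; it assumes a simple multiplicative modulation and a top-k log penalty, which may differ from the paper's exact formulation) shows how sentence-level attention could scale word-level attention and how an inconsistency penalty could be computed per decoder step.

```python
import numpy as np

def modulate_word_attention(word_attn, sent_attn, word_to_sent):
    """Scale each word's attention by the attention of its sentence, then renormalize."""
    scaled = word_attn * sent_attn[word_to_sent]
    return scaled / scaled.sum()

def inconsistency_loss(word_attn_steps, sent_attn, word_to_sent, k=3):
    """Penalize decoder steps whose top-k attended words lie in weakly attended sentences."""
    losses = []
    for word_attn in word_attn_steps:          # one word-attention vector per decoder step
        top_k = np.argsort(word_attn)[-k:]     # indices of the k most attended words
        score = np.mean(word_attn[top_k] * sent_attn[word_to_sent[top_k]])
        losses.append(-np.log(score + 1e-12))
    return float(np.mean(losses))

# Toy example: 6 source words spread over 2 sentences.
word_to_sent = np.array([0, 0, 0, 1, 1, 1])
sent_attn = np.array([0.8, 0.2])               # the extractor favors sentence 0
word_attn = np.array([0.1, 0.3, 0.1, 0.3, 0.1, 0.1])

print(modulate_word_attention(word_attn, sent_attn, word_to_sent))
print(inconsistency_loss([word_attn], sent_attn, word_to_sent, k=2))
```

In this toy run, the word in the weakly attended sentence loses most of its attention mass after modulation, and the inconsistency penalty is large whenever the decoder's top-attended words disagree with the extractor's sentence scores.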
