Abstractive Text Summarization with Multi-Head Attention

In this paper, we present a novel sequence-to-sequence architecture with multi-head attention for automatic summarization of long texts. Summaries generated by previous abstractive methods commonly suffer from repeated content and from loss of information in the original document. To address these problems, we propose a multi-head attention summarization (MHAS) model, which uses the multi-head attention mechanism to learn relevant information in different representation subspaces. When generating a new word, the MHAS model takes the previously predicted words into account, which helps it avoid producing summaries with redundant repetition. By adding self-attention layers to the traditional encoder and decoder, the model learns the internal structure of the article and better preserves the original information. We also integrate the multi-head attention distribution into a pointer network to further improve performance. Experiments are conducted on the CNN/Daily Mail dataset, a long-text English corpus. Experimental results show that our proposed model outperforms previous extractive and abstractive models.
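As a concrete illustration of attending over different representation subspaces, the sketch below implements a plain scaled dot-product multi-head attention step in NumPy. It is a minimal, hypothetical example rather than the authors' MHAS implementation: for brevity it replaces the learned per-head projection matrices with simple slicing of the model dimension, and all function names and dimensions are assumptions made for illustration.

```python
# Minimal sketch of scaled dot-product multi-head attention (illustrative
# only; not the MHAS model). Each head attends within its own slice of the
# model dimension, i.e. its own representation subspace.
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(Q, K, V, num_heads):
    """Q, K, V: arrays of shape (seq_len, d_model).
    Splits d_model into num_heads subspaces, attends in each, concatenates."""
    seq_len, d_model = Q.shape
    d_k = d_model // num_heads
    outputs = []
    for h in range(num_heads):
        s = slice(h * d_k, (h + 1) * d_k)
        q, k, v = Q[:, s], K[:, s], V[:, s]
        # Scaled dot-product attention within this head's subspace.
        scores = q @ k.T / np.sqrt(d_k)      # (seq_len, seq_len)
        weights = softmax(scores, axis=-1)   # attention distribution per head
        outputs.append(weights @ v)          # (seq_len, d_k)
    return np.concatenate(outputs, axis=-1)  # (seq_len, d_model)

# Toy usage: self-attention over a 5-token sequence with d_model = 8.
x = np.random.randn(5, 8)
out = multi_head_attention(x, x, x, num_heads=2)
print(out.shape)  # (5, 8)
```

In the full Transformer-style formulation, each head would additionally apply learned linear projections to the queries, keys, and values before attending, and a final output projection after concatenation; the per-head attention distributions computed above are the kind of weights that a pointer mechanism can reuse for copying source words.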
