BiSET: Bi-directional Selective Encoding with Template for Abstractive Summarization

The success of neural summarization models stems from meticulous encoding of source articles. To overcome the impediments of limited and sometimes noisy training data, one promising direction is to make better use of the available data by filtering it during summarization. In this paper, we propose a novel Bi-directional Selective Encoding with Template (BiSET) model, which leverages templates discovered from the training data to softly select key information from each source article and guide its summarization. Extensive experiments on a standard summarization dataset show that the template-equipped BiSET model improves summarization performance significantly and achieves a new state of the art.
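To make the "soft selection" idea concrete, below is a minimal NumPy sketch of a bi-directional selective gate in the spirit of selective encoding: each sequence's hidden states are filtered by a sigmoid gate conditioned on a pooled summary of the other sequence (template filters article, article filters template). This is an illustrative sketch, not BiSET's exact interaction layer; the function name `selective_gate`, the weights `W_h`/`W_s`, and the mean-pooled summary vectors are assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def selective_gate(states, summary, W_h, W_s, b):
    """Softly filter hidden states using a summary of the other sequence.

    states:  (seq_len, d) hidden states to be filtered
    summary: (d,) pooled representation of the other sequence
    Returns gated states of shape (seq_len, d).
    """
    gate = sigmoid(states @ W_h + summary @ W_s + b)  # elementwise gate in (0, 1)
    return states * gate                              # keep salient dimensions, damp the rest

# Toy dimensions; in a real model these weights would be learned.
rng = np.random.default_rng(0)
d, n, m = 8, 5, 4
article  = rng.normal(size=(n, d))   # encoder states of the source article
template = rng.normal(size=(m, d))   # encoder states of the retrieved template

W_h, W_s = rng.normal(size=(d, d)), rng.normal(size=(d, d))
b = np.zeros(d)

# Bi-directional selection: each side softly filters the other.
article_filtered  = selective_gate(article,  template.mean(axis=0), W_h, W_s, b)
template_filtered = selective_gate(template, article.mean(axis=0),  W_h, W_s, b)
```

Because the gate is a sigmoid rather than a hard mask, selection remains differentiable, so the filtering behavior can be trained end-to-end with the rest of the summarizer.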
