Bidirectional Attentional Encoder-Decoder Model and Bidirectional Beam Search for Abstractive Summarization

Sequence generation models based on RNN variants such as LSTM and GRU show promising performance on abstractive document summarization. However, they still have issues that limit their performance, especially when dealing with long sequences. One issue is that, to the best of our knowledge, all current models employ a unidirectional decoder, which reasons only about the past and cannot retain future context when making a prediction; as a result, these models tend to generate unbalanced outputs. Moreover, unidirectional attention-based document summarization can capture only partial aspects of attentional regularities, owing to the inherent challenges of document summarization. To this end, we propose an end-to-end trainable bidirectional RNN model to tackle these issues. The model has a bidirectional encoder-decoder architecture in which both the encoder and the decoder are bidirectional LSTMs. The forward decoder is initialized with the last hidden state of the backward encoder, while the backward decoder is initialized with the last hidden state of the forward encoder. In addition, a bidirectional beam search mechanism is proposed as an approximate inference algorithm for generating output summaries from the bidirectional model. This enables the model to reason about both the past and the future and, as a result, to generate balanced outputs. Experimental results on the CNN/Daily Mail dataset show that the proposed model outperforms current abstractive state-of-the-art models by a considerable margin.
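The cross-initialization scheme described above can be sketched as follows. This is an illustrative toy, not the paper's implementation: a real model would use trained LSTM cells over word embeddings, whereas here a trivial scalar mixing step stands in for the recurrence, and all names (`toy_rnn_step`, `run_encoder`) are hypothetical.

```python
# Toy sketch of the cross-initialization between bidirectional
# encoders and decoders (illustrative only, not the authors' code).

def toy_rnn_step(state, token_val):
    # stand-in for an LSTM cell: mixes the previous state with the input
    return 0.5 * state + 0.5 * token_val

def run_encoder(tokens, reverse=False):
    # read the sequence left-to-right (forward) or right-to-left (backward)
    seq = reversed(tokens) if reverse else tokens
    state = 0.0
    for t in seq:
        state = toy_rnn_step(state, t)
    return state  # last hidden state

tokens = [1.0, 2.0, 3.0]
h_fwd_enc = run_encoder(tokens)                # forward encoder's last state
h_bwd_enc = run_encoder(tokens, reverse=True)  # backward encoder's last state

# Cross-initialization described in the abstract: the forward decoder
# starts from the *backward* encoder's last state, and the backward
# decoder starts from the *forward* encoder's last state, so each
# decoder begins with a summary of the end it decodes toward.
forward_decoder_init = h_bwd_enc
backward_decoder_init = h_fwd_enc
```

At inference time, the two decoders would each drive one direction of the proposed bidirectional beam search, with complete hypotheses scored under both directions.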
