Soft Layer-Specific Multi-Task Summarization with Entailment and Question Generation

An accurate abstractive summary of a document should contain all its salient information and should be logically entailed by the input document. We improve these important aspects of abstractive summarization via multi-task learning with the auxiliary tasks of question generation and entailment generation, where the former teaches the summarization model how to look for salient questioning-worthy details, and the latter teaches the model how to rewrite a summary which is a directed-logical subset of the input document. We also propose novel multi-task architectures with high-level (semantic) layer-specific sharing across multiple encoder and decoder layers of the three tasks, as well as soft-sharing mechanisms (and show performance ablations and analysis examples of each contribution). Overall, we achieve statistically significant improvements over the state-of-the-art on both the CNN/DailyMail and Gigaword datasets, as well as on the DUC-2002 transfer setup. We also present several quantitative and qualitative analysis studies of our model's learned saliency and entailment skills.

[1]  Alexander M. Rush,et al.  Abstractive Sentence Summarization with Attentive Recurrent Neural Networks , 2016, NAACL.

[2]  Claire Cardie,et al.  A Sentence Compression Based Framework to Query-Focused Multi-Document Summarization , 2013, ACL.

[3]  Xinya Du,et al.  Learning to Ask: Neural Question Generation for Reading Comprehension , 2017, ACL.

[4]  Rob Fergus,et al.  Visualizing and Understanding Convolutional Networks , 2013, ECCV.

[5]  Alon Lavie,et al.  Meteor Universal: Language Specific Translation Evaluation for Any Target Language , 2014, WMT@ACL.

[6]  Masaaki Nagata,et al.  RNN-based Encoder-decoder Approach with Word Frequency Estimation , 2017, ArXiv.

[7]  Bowen Zhou,et al.  Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond , 2016, CoNLL.

[8]  S. T. Buckland,et al.  An Introduction to the Bootstrap. , 1994 .

[9]  Giuseppe Carenini,et al.  Abstractive Meeting Summarization with Entailment and Fusion , 2013, ENLG.

[10]  Xinlei Chen,et al.  Microsoft COCO Captions: Data Collection and Evaluation Server , 2015, ArXiv.

[11]  Ramakanth Pasunuru,et al.  Multi-Reward Reinforced Summarization with Saliency and Entailment , 2018, NAACL.

[12]  Hal Daumé,et al.  Learning Task Grouping and Overlap in Multi-task Learning , 2012, ICML.

[13]  Christopher D. Manning,et al.  Get To The Point: Summarization with Pointer-Generator Networks , 2017, ACL.

[14]  Kathleen McKeown,et al.  Cut and Paste Based Text Summarization , 2000, ANLP.

[15]  Ramakanth Pasunuru,et al.  Multi-Task Video Captioning with Video and Entailment Generation , 2017, ACL.

[16]  Ido Dagan,et al.  The Third PASCAL Recognizing Textual Entailment Challenge , 2007, ACL-PASCAL@ACL.

[17]  Iryna Gurevych,et al.  A Reinforcement Learning Approach for Adaptive Single- and Multi-Document Summarization , 2015, GSCL.

[18]  Zhen-Hua Ling,et al.  Distraction-based neural networks for modeling documents , 2016, IJCAI 2016.

[19]  Jiawei Han,et al.  Opinosis: A Graph Based Approach to Abstractive Summarization of Highly Redundant Opinions , 2010, COLING.

[20]  Jason Weston,et al.  A Neural Attention Model for Abstractive Sentence Summarization , 2015, EMNLP.

[21]  Yoshua Bengio,et al.  Neural Machine Translation by Jointly Learning to Align and Translate , 2014, ICLR.

[22]  Yoshimasa Tsuruoka,et al.  A Joint Many-Task Model: Growing a Neural Network for Multiple NLP Tasks , 2016, EMNLP.

[23]  Iryna Gurevych,et al.  Bringing Structure into Summaries: Crowdsourcing a Benchmark Corpus of Concept Maps , 2017, EMNLP.

[24]  Noah A. Smith,et al.  Toward Abstractive Summarization Using Semantic Representations , 2018, NAACL.

[25]  Harish Karnick,et al.  Text Summarization using Abstract Meaning Representation , 2017, ArXiv.

[26]  Jackie Chi Kit Cheung,et al.  Unsupervised Sentence Enhancement for Automatic Summarization , 2014, EMNLP.

[27]  Chin-Yew Lin,et al.  ROUGE: A Package for Automatic Evaluation of Summaries , 2004, ACL 2004.

[28]  Phil Blunsom,et al.  Teaching Machines to Read and Comprehend , 2015, NIPS.

[29]  Navdeep Jaitly,et al.  Pointer Networks , 2015, NIPS.

[30]  Jimmy Ba,et al.  Adam: A Method for Stochastic Optimization , 2014, ICLR.

[31]  Manpreet Kaur,et al.  Text Summarization through Entailment-based Minimum Vertex Cover , 2014, *SEMEVAL.

[32]  Lukasz Kaiser,et al.  Sentence Compression by Deletion with LSTMs , 2015, EMNLP.

[33]  Alice Lai,et al.  Illinois-LH: A Denotational and Distributional Approach to Semantics , 2014, *SEMEVAL.

[34]  Mihai Surdeanu,et al.  The Stanford CoreNLP Natural Language Processing Toolkit , 2014, ACL.

[35]  Lukasz Kaiser,et al.  One Model To Learn Them All , 2017, ArXiv.

[36]  Xiaojun Wan,et al.  Abstractive Document Summarization with a Graph-Based Attentional Neural Model , 2017, ACL.

[37]  Ramakanth Pasunuru,et al.  Towards Improving Abstractive Summarization via Entailment Generation , 2017, NFiS@EMNLP.

[38]  Giuseppe Carenini,et al.  Abstractive Summarization of Product Reviews Using Discourse Structure , 2014, EMNLP.

[39]  Fernando Diaz,et al.  Predicting Salient Updates for Disaster Summarization , 2015, ACL.

[40]  Alexander F. Gelbukh,et al.  UNAL-NLP: Combining Soft Cardinality Features for Semantic Textual Similarity, Relatedness and Entailment , 2014, *SEMEVAL.

[41]  Joachim Bingel,et al.  Sluice networks: Learning what to share between loosely related tasks , 2017, ArXiv.

[42]  Massimiliano Pontil,et al.  Multi-Task Feature Learning , 2006, NIPS.

[43]  Yonatan Belinkov,et al.  What do Neural Machine Translation Models Learn about Morphology? , 2017, ACL.

[44]  Quoc V. Le,et al.  Multi-task Sequence to Sequence Learning , 2015, ICLR.

[45]  Daniel Marcu,et al.  Summarization beyond sentence extraction: A probabilistic approach to sentence compression , 2002, Artif. Intell..

[46]  S. T. Buckland,et al.  Computer-Intensive Methods for Testing Hypotheses. , 1990 .

[47]  J Quinonero Candela,et al.  Machine Learning Challenges. Evaluating Predictive Uncertainty, Visual Object Classification, and Recognising Tectual Entailment , 2006, Lecture Notes in Computer Science.

[48]  J. Clarke,et al.  Global inference for sentence compression : an integer linear programming approach , 2008, J. Artif. Intell. Res..

[49]  Richard Socher,et al.  A Deep Reinforced Model for Abstractive Summarization , 2017, ICLR.

[50]  Rich Caruana,et al.  Multitask Learning , 1998, Encyclopedia of Machine Learning and Data Mining.

[51]  Christopher Potts,et al.  A large annotated corpus for learning natural language inference , 2015, EMNLP.

[52]  M. Kenward,et al.  An Introduction to the Bootstrap , 2007 .

[53]  Zhen-Hua Ling,et al.  Enhanced LSTM for Natural Language Inference , 2016, ACL.

[54]  Jian Zhang,et al.  SQuAD: 100,000+ Questions for Machine Comprehension of Text , 2016, EMNLP.

[55]  Γεώργιος Γιαννακόπουλος,et al.  Automatic Summarization from Multiple Documents , 2009 .

[56]  Martial Hebert,et al.  Cross-Stitch Networks for Multi-task Learning , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).