Paper information - Self-Supervised Learning for Contextualized Extractive Summarization

Existing models for extractive summarization are usually trained from scratch with a cross-entropy loss, which does not explicitly capture the global context at the document level. In this paper, we aim to improve this task by introducing three auxiliary pre-training tasks that learn to capture the document-level context in a self-supervised fashion. Experiments on the widely used CNN/DM dataset validate the effectiveness of the proposed auxiliary tasks. Furthermore, we show that after pre-training, a clean model with simple building blocks is able to outperform previous state-of-the-art models that are carefully designed.
