Fine-tune BERT for Extractive Summarization

BERT, a pre-trained Transformer model, has achieved ground-breaking performance on multiple NLP tasks. In this paper, we describe BERTSUM, a simple variant of BERT, for extractive summarization. Our system is the state of the art on the CNN/Dailymail dataset, outperforming the previous best-performing system by 1.65 on ROUGE-L. The code to reproduce our results is available at this https URL.
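The abstract does not spell out the architecture, so the following is only a minimal sketch of the general recipe for fine-tuning BERT as an extractive (sentence-selection) summarizer: insert a [CLS] token before each sentence, encode the whole document once, and score each sentence with a small classifier over its [CLS] vector. This is not the authors' released code; the Hugging Face model name, the single linear classifier, and the toy input are illustrative assumptions, and details from the full paper (e.g. interval segment embeddings, deeper summarization layers) are omitted.

```python
# Hedged sketch: score sentences for extractive summarization with a fine-tunable
# BERT encoder. Assumes the `transformers` and `torch` packages are installed.
import torch
import torch.nn as nn
from transformers import BertModel, BertTokenizer

class ExtractiveScorer(nn.Module):
    def __init__(self, bert_name="bert-base-uncased"):  # model name is an assumption
        super().__init__()
        self.bert = BertModel.from_pretrained(bert_name)
        self.classifier = nn.Linear(self.bert.config.hidden_size, 1)

    def forward(self, input_ids, attention_mask, cls_positions):
        # hidden: (batch=1, seq_len, hidden_size)
        hidden = self.bert(input_ids=input_ids,
                           attention_mask=attention_mask).last_hidden_state
        # Take the vector at each sentence-initial [CLS] position.
        cls_vecs = hidden[0, cls_positions]                      # (num_sents, hidden_size)
        return torch.sigmoid(self.classifier(cls_vecs)).squeeze(-1)  # one score per sentence

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
sentences = ["The cat sat on the mat.", "It was a sunny day.", "Cats like warm places."]

# Build one sequence: [CLS] sent1 [SEP] [CLS] sent2 [SEP] ... (must stay under 512 tokens).
ids, cls_positions = [], []
for sent in sentences:
    cls_positions.append(len(ids))          # index of this sentence's [CLS] token
    ids += tokenizer.encode(sent)           # encode() wraps the sentence in [CLS] ... [SEP]

input_ids = torch.tensor([ids])
attention_mask = torch.ones_like(input_ids)

model = ExtractiveScorer()
scores = model(input_ids, attention_mask, torch.tensor(cls_positions))
print(scores)  # rank sentences by score; the top-ranked ones form the extractive summary
```

In practice such a scorer would be trained with a binary cross-entropy loss against oracle sentence labels, and at inference time one would keep the top-scoring sentences (the paper additionally reports redundancy-reduction heuristics, which this sketch leaves out).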
