Demoting the Lead Bias in News Summarization via Alternating Adversarial Learning

In news articles, lead bias is a common phenomenon that usually dominates the learning signal for neural extractive summarizers, severely limiting their performance on data with a different bias or no such bias at all. In this paper, we introduce a novel technique to demote lead bias and make the summarizer focus more on content semantics. Experiments on two news corpora with different degrees of lead bias show that our method can effectively demote the model's learned lead bias and improve its generalization to out-of-distribution data, with little to no performance loss on in-distribution data.
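The abstract does not spell out the training procedure, so the following is only a minimal PyTorch sketch of one plausible form of alternating adversarial learning for demoting position information: an extractive summarizer is trained alongside an adversary that tries to predict each sentence's position from the shared sentence representations, and the two objectives are updated in alternation. The module names, the negated cross-entropy "confusion" loss, and the weight LAMBDA are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

ENC_DIM, MAX_SENTS = 256, 50


class SentenceEncoder(nn.Module):
    """Maps pre-computed sentence embeddings to the representations we
    want to strip of position (lead) information."""
    def __init__(self, in_dim=768, enc_dim=ENC_DIM):
        super().__init__()
        self.proj = nn.Sequential(nn.Linear(in_dim, enc_dim), nn.ReLU())

    def forward(self, sent_embs):           # (batch, n_sents, in_dim)
        return self.proj(sent_embs)         # (batch, n_sents, enc_dim)


class Extractor(nn.Module):
    """Scores each sentence for inclusion in the extractive summary."""
    def __init__(self, enc_dim=ENC_DIM):
        super().__init__()
        self.scorer = nn.Linear(enc_dim, 1)

    def forward(self, h):
        return self.scorer(h).squeeze(-1)   # (batch, n_sents) logits


class PositionAdversary(nn.Module):
    """Tries to recover each sentence's position from its representation."""
    def __init__(self, enc_dim=ENC_DIM, max_sents=MAX_SENTS):
        super().__init__()
        self.clf = nn.Linear(enc_dim, max_sents)

    def forward(self, h):
        return self.clf(h)                  # (batch, n_sents, max_sents) logits


encoder, extractor, adversary = SentenceEncoder(), Extractor(), PositionAdversary()
opt_main = torch.optim.Adam(list(encoder.parameters()) + list(extractor.parameters()), lr=1e-4)
opt_adv = torch.optim.Adam(adversary.parameters(), lr=1e-4)
bce, ce = nn.BCEWithLogitsLoss(), nn.CrossEntropyLoss()
LAMBDA = 0.1  # hypothetical weight on the adversarial term


def train_step(sent_embs, ext_labels, positions):
    """One alternating step: (1) train the adversary, (2) train the summarizer against it."""
    # Phase 1: update only the adversary to predict sentence positions
    # from the (frozen) encoder's representations.
    opt_adv.zero_grad()
    with torch.no_grad():
        h = encoder(sent_embs)
    adv_loss = ce(adversary(h).flatten(0, 1), positions.flatten())
    adv_loss.backward()
    opt_adv.step()

    # Phase 2: update encoder + extractor to select good sentences while
    # making the (now frozen) adversary's position predictions fail,
    # i.e. demoting position/lead information in the representations.
    opt_main.zero_grad()
    h = encoder(sent_embs)
    ext_loss = bce(extractor(h), ext_labels)
    confusion = -ce(adversary(h).flatten(0, 1), positions.flatten())
    (ext_loss + LAMBDA * confusion).backward()
    opt_main.step()
    return ext_loss.item(), adv_loss.item()


# Dummy batch: 2 documents, 10 sentences each, 768-d sentence embeddings.
embs = torch.randn(2, 10, 768)
labels = torch.randint(0, 2, (2, 10)).float()      # oracle extraction labels
pos = torch.arange(10).unsqueeze(0).expand(2, -1)  # sentence indices 0..9
print(train_step(embs, labels, pos))
```

Alternating the two updates, rather than using a gradient-reversal layer, keeps the adversary a reliable probe of how much position information remains in the encoder; the negated cross-entropy term is just one simple choice of confusion objective under these assumptions.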
