Exploring Domain Shift in Extractive Text Summarization

Although domain shift has been well explored in many NLP applications, it has received little attention in extractive text summarization. As a result, existing models ignore differences in the distributions of their training sets, under-utilize the training data, and generalize poorly to unseen domains. With this limitation in mind, we first extend the conventional definition of a domain from categories to data sources for the text summarization task. We then re-purpose a multi-domain summarization dataset and verify how the gap between domains influences the performance of neural summarization models. Furthermore, we investigate four learning strategies and examine their ability to cope with the domain shift problem. Experimental results under three different settings reveal their distinct characteristics in our new testbed. Our source code, including the BERT-based and meta-learning methods for multi-domain summarization learning, and the re-purposed dataset Multi-SUM will be available on our project page: this http URL.
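To make the meta-learning strategy mentioned above more concrete, below is a minimal sketch of a first-order MAML-style training loop over multiple source domains for an extractive summarizer. It is an illustration only, not the paper's implementation: the toy `SentenceScorer`, the `sample_batch` loader, the embedding dimension, and the learning rates are all assumed for the example, and a real setup would adapt on real multi-domain summarization data.

```python
# Minimal sketch (assumed, not the authors' code): first-order MAML-style
# meta-training across several source domains for an extractive summarizer.
import copy
import torch
import torch.nn as nn

EMB_DIM = 128                # assumed size of pre-computed sentence embeddings
INNER_LR, OUTER_LR = 1e-2, 1e-3

class SentenceScorer(nn.Module):
    """Toy extractive model: scores each sentence for inclusion in the summary."""
    def __init__(self, dim=EMB_DIM):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(dim, 64), nn.ReLU(), nn.Linear(64, 1))

    def forward(self, sent_embs):                 # (batch, n_sents, dim)
        return self.net(sent_embs).squeeze(-1)    # (batch, n_sents) logits

def sample_batch(domain_seed, batch=8, n_sents=20):
    """Stand-in for a real multi-domain loader (one data source per domain)."""
    g = torch.Generator().manual_seed(domain_seed)
    x = torch.randn(batch, n_sents, EMB_DIM, generator=g)
    y = (torch.rand(batch, n_sents, generator=g) < 0.3).float()  # oracle labels
    return x, y

model = SentenceScorer()
outer_opt = torch.optim.Adam(model.parameters(), lr=OUTER_LR)
loss_fn = nn.BCEWithLogitsLoss()
domains = [0, 1, 2, 3, 4]                         # e.g., five training data sources

for step in range(100):
    outer_opt.zero_grad()
    for d in domains:
        # Inner loop: adapt a copy of the shared model on a support batch
        # drawn from domain d.
        fast = copy.deepcopy(model)
        inner_opt = torch.optim.SGD(fast.parameters(), lr=INNER_LR)
        xs, ys = sample_batch(d)
        fast.zero_grad()
        loss_fn(fast(xs), ys).backward()
        inner_opt.step()

        # Outer step: evaluate the adapted copy on a query batch and copy its
        # gradients back to the shared model (first-order approximation).
        xq, yq = sample_batch(d + 1000)
        fast.zero_grad()
        loss_fn(fast(xq), yq).backward()
        for p, fp in zip(model.parameters(), fast.parameters()):
            p.grad = fp.grad.clone() if p.grad is None else p.grad + fp.grad
    outer_opt.step()
```

The per-domain inner adaptation followed by an aggregated outer update is what encourages parameters that transfer across data sources; other strategies from the paper (e.g., a plain BERT-based model trained on the pooled data) would drop the inner loop entirely.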
