UniSumm: Unified Few-shot Summarization with Multi-Task Pre-Training and Prefix-Tuning

The diverse demands of different summarization tasks and their high annotation costs are driving a need for few-shot summarization. However, despite the emergence of many summarization tasks and datasets, the current training paradigm for few-shot summarization systems ignores the knowledge that could be shared across heterogeneous datasets. To this end, we propose \textsc{UniSumm}, a unified few-shot summarization model that is pre-trained on multiple summarization tasks and can be prefix-tuned to excel at any few-shot summarization task. Meanwhile, to better evaluate few-shot summarization systems, we assemble and release a new benchmark, \textsc{SummZoo}, designed around the principles of diversity and robustness. It consists of $8$ diverse summarization tasks with multiple sets of few-shot samples for each task, covering both monologue and dialogue domains. Experimental results and ablation studies show that \textsc{UniSumm} outperforms strong baseline systems by a large margin across all tasks in \textsc{SummZoo} under both automatic and human evaluations. We release our code and benchmark at \url{https://github.com/microsoft/UniSumm}.
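To make the prefix-tuning step concrete, the sketch below shows how a frozen summarization backbone can be adapted on few-shot data by training only a small continuous prefix. This is an illustrative sketch rather than the released \textsc{UniSumm} implementation: it relies on the Hugging Face transformers and peft libraries, and the backbone checkpoint, prefix length, and learning rate are assumed values, not figures from the paper.

```python
# Minimal sketch (not the authors' code) of few-shot prefix-tuning
# for abstractive summarization with a frozen seq2seq backbone.
import torch
from transformers import BartForConditionalGeneration, BartTokenizer
from peft import PrefixTuningConfig, TaskType, get_peft_model

# Assumed backbone; UniSumm builds on a BART-style seq2seq model.
base = BartForConditionalGeneration.from_pretrained("facebook/bart-large")
tokenizer = BartTokenizer.from_pretrained("facebook/bart-large")

# Only the prefix (continuous key/value vectors prepended to every
# attention layer) is trainable; the backbone weights stay frozen.
config = PrefixTuningConfig(task_type=TaskType.SEQ_2_SEQ_LM, num_virtual_tokens=64)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # prefix params are a tiny fraction of the model

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)  # assumed hyperparameters

def few_shot_step(document: str, summary: str) -> float:
    """One gradient step on a single (document, summary) pair from a few-shot set."""
    inputs = tokenizer(document, truncation=True, max_length=1024, return_tensors="pt")
    labels = tokenizer(summary, truncation=True, max_length=128, return_tensors="pt").input_ids
    loss = model(**inputs, labels=labels).loss
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return loss.item()
```

In this setup the same frozen backbone can serve many target tasks, with a separate lightweight prefix tuned per few-shot dataset, which mirrors the parameter-efficient adaptation idea the abstract describes.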
