Simple Conversational Data Augmentation for Semi-supervised Abstractive Dialogue Summarization

Abstractive conversation summarization has received growing attention, yet most current state-of-the-art summarization models rely heavily on human-annotated summaries. To reduce this dependence on labeled summaries, we present a simple yet effective set of Conversational Data Augmentation (CODA) methods for semi-supervised abstractive conversation summarization: random swapping/deletion to perturb the discourse relations inside conversations, dialogue-acts-guided insertion to interrupt the development of conversations, and conditional-generation-based substitution to replace utterances with paraphrases generated from the conversation context. To further exploit unlabeled conversations, we combine CODA with two-stage noisy self-training, in which we first pre-train the summarization model on unlabeled conversations paired with pseudo summaries and then fine-tune it on labeled conversations. Experiments on recent conversation summarization datasets demonstrate the effectiveness of our methods over several state-of-the-art data augmentation baselines.
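To make the random swapping/deletion idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes that a conversation is a list of utterance strings and that perturbation happens at the utterance level; the function names `random_swap` and `random_delete`, the default swap count, and the deletion probability are all hypothetical choices made for illustration, since the abstract does not specify them.

```python
import random
from typing import List, Optional


def random_swap(utterances: List[str], n_swaps: int = 1,
                rng: Optional[random.Random] = None) -> List[str]:
    """Return a copy of the conversation with n_swaps random pairs of utterances exchanged.

    Assumption: swapping is done on whole utterances, which perturbs their order.
    """
    rng = rng or random.Random()
    out = list(utterances)
    if len(out) < 2:
        return out  # nothing to swap
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(out)), 2)  # two distinct positions
        out[i], out[j] = out[j], out[i]
    return out


def random_delete(utterances: List[str], p: float = 0.1,
                  rng: Optional[random.Random] = None) -> List[str]:
    """Return a copy of the conversation with each utterance dropped with probability p.

    Assumption: at least one utterance is always kept so the input is never empty
    (unless the original conversation was empty).
    """
    rng = rng or random.Random()
    if not utterances:
        return []
    kept = [u for u in utterances if rng.random() >= p]
    return kept if kept else [rng.choice(utterances)]


if __name__ == "__main__":
    dialogue = ["A: hi", "B: hello", "A: lunch at noon?", "B: sure, see you"]
    rng = random.Random(0)  # fixed seed so repeated runs give the same result
    print(random_swap(dialogue, n_swaps=1, rng=rng))
    print(random_delete(dialogue, p=0.3, rng=rng))
```

In a semi-supervised setup of the kind the abstract describes, perturbed conversations like these would be paired with the original (or pseudo) summary as additional training inputs; how strongly to perturb is a tuning decision that this sketch leaves open.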
