Simple Conversational Data Augmentation for Semi-supervised Abstractive Dialogue Summarization

Abstractive conversation summarization has received growing attention, yet most current state-of-the-art summarization models rely heavily on human-annotated summaries. To reduce this dependence on labeled summaries, we present a simple yet effective set of Conversational Data Augmentation (CODA) methods for semi-supervised abstractive conversation summarization, such as random swapping/deletion to perturb the discourse relations inside conversations, dialogue-acts-guided insertion to interrupt the development of conversations, and conditional-generation-based substitution to replace utterances with paraphrases generated from the conversation context. To further utilize unlabeled conversations, we combine CODA with two-stage noisy self-training, in which we first pre-train the summarization model on unlabeled conversations with pseudo summaries and then fine-tune it on labeled conversations. Experiments on recent conversation summarization datasets demonstrate the effectiveness of our methods over several state-of-the-art data augmentation baselines.
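To make the simplest of these operations concrete, the following is a minimal sketch of utterance-level random swapping and deletion. It is an illustration only, not the authors' implementation: the function names, the `n_swaps` and `p` parameters, and the representation of a conversation as a list of strings are all assumptions, and the abstract does not state what rates or granularity the paper uses.

```python
import random
from typing import List, Optional


def random_swap(utterances: List[str], n_swaps: int = 1,
                rng: Optional[random.Random] = None) -> List[str]:
    """Return a copy of the conversation with `n_swaps` random pairs of
    utterances exchanged (hypothetical helper, not from the paper)."""
    if rng is None:
        rng = random.Random()
    out = list(utterances)  # never mutate the caller's conversation
    if len(out) < 2:
        return out
    for _ in range(n_swaps):
        # Pick two distinct positions and exchange them.
        i, j = rng.sample(range(len(out)), 2)
        out[i], out[j] = out[j], out[i]
    return out


def random_delete(utterances: List[str], p: float = 0.1,
                  rng: Optional[random.Random] = None) -> List[str]:
    """Drop each utterance independently with probability `p`, keeping the
    original order (hypothetical helper, not from the paper)."""
    if rng is None:
        rng = random.Random()
    if not utterances:
        return []
    kept = [u for u in utterances if rng.random() >= p]
    # Assumption: an augmented conversation should never be empty,
    # so fall back to one randomly chosen utterance.
    return kept if kept else [rng.choice(utterances)]


if __name__ == "__main__":
    conversation = [
        "A: Are we still meeting at six?",
        "B: Yes, at the cafe on Main Street.",
        "A: Great, I will bring the slides.",
        "B: Perfect, see you there.",
    ]
    rng = random.Random(0)  # seeded for reproducibility
    print(random_swap(conversation, n_swaps=1, rng=rng))
    print(random_delete(conversation, p=0.3, rng=rng))
```

In a semi-supervised setup like the one described, perturbed conversations of this kind would be paired with the original (or pseudo) summary, so the model sees noisy inputs with unchanged targets.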

Diyi Yang | Jiaao Chen
