Zero-Shot Dialogue Disentanglement by Self-Supervised Entangled Response Selection

Dialogue disentanglement aims to group the utterances of a long, multi-participant dialogue into threads. This is useful for discourse analysis and downstream applications such as dialogue response selection, where it can serve as the first step toward constructing a clean context/response set. Unfortunately, labeling all reply-to links takes quadratic effort with respect to the number of utterances: an annotator must check all preceding utterances to identify the one to which the current utterance replies. In this paper, we are the first to propose a zero-shot dialogue disentanglement solution. First, we train a model on an unannotated multi-participant response selection dataset harvested from the web; we then apply the trained model to perform zero-shot dialogue disentanglement. Without any labeled data, our model achieves a cluster F1 score of 25. We also fine-tune the model with varying amounts of labeled data. Experiments show that with only 10% of the data, we achieve nearly the same performance as using the full dataset.
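To make the zero-shot inference step concrete, here is a minimal sketch of how reply-to scores from a response-selection model could be turned into threads. The `reply_score` callable is a hypothetical stand-in for the trained model's score that utterance i replies to an earlier utterance j (derived from self-supervised attention in the paper); the union-find clustering shown here is a generic illustration, not the authors' exact implementation.

```python
from typing import Callable, List


def disentangle(utterances: List[str],
                reply_score: Callable[[List[str], int, int], float]) -> List[int]:
    """Assign each utterance a thread id by linking it to its best-scoring parent.

    reply_score(utterances, i, j) -> float: assumed model score that utterance i
    is a reply to utterance j (j <= i; j == i means "starts a new thread").
    """
    parent = list(range(len(utterances)))  # union-find parent pointers

    def find(x: int) -> int:
        # Follow parent pointers to the thread root, with path compression.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i in range(1, len(utterances)):
        # Choose the preceding utterance (or self) with the highest reply-to score.
        best_j = max(range(i + 1), key=lambda j: reply_score(utterances, i, j))
        if best_j != i:
            parent[find(i)] = find(best_j)  # merge utterance i into its parent's thread

    # Relabel thread roots as consecutive integer thread ids.
    roots = sorted({find(i) for i in range(len(utterances))})
    root_to_thread = {r: t for t, r in enumerate(roots)}
    return [root_to_thread[find(i)] for i in range(len(utterances))]
```

The resulting thread assignments can then be compared against gold annotations with clustering metrics such as the cluster F1 score reported above.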
