Paper Information - MultiDoc2Dial: Modeling Dialogues Grounded in Multiple Documents

We propose MultiDoc2Dial, a new task and dataset on modeling goal-oriented dialogues grounded in multiple documents. Most previous works treat document-grounded dialogue modeling as a machine reading comprehension task based on a single given document or passage. In this work, we aim to address more realistic scenarios where a goal-oriented information-seeking conversation involves multiple topics, and hence is grounded on different documents. To facilitate such a task, we introduce a new dataset that contains dialogues grounded in multiple documents from four different domains. We also explore modeling the dialogue-based and document-based context in the dataset. We present strong baseline approaches and various experimental results, aiming to support further research efforts on such a task.
