Discern: Discourse-Aware Entailment Reasoning Network for Conversational Machine Reading

Document interpretation and dialog understanding are the two major challenges in conversational machine reading. In this work, we propose Discern, a discourse-aware entailment reasoning network that strengthens the connection between, and the understanding of, both the document and the dialog. Specifically, we split the document into clause-like elementary discourse units (EDUs) using a pre-trained discourse segmentation model, and we train our model in a weakly supervised manner to predict whether each EDU is entailed by the user feedback in the conversation. Based on the learned EDU and entailment representations, we either reply to the user with a final decision ("yes"/"no"/"irrelevant") on the initial question, or generate a follow-up question to inquire about missing information. Our experiments on the ShARC benchmark (blind, held-out test set) show that Discern achieves state-of-the-art results of 78.3% macro-averaged accuracy on decision making and 64.0 BLEU-1 on follow-up question generation. Code and models are released at https://github.com/Yifan-Gao/Discern.
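
To make the pipeline concrete, below is a minimal sketch of the decision-making step in PyTorch with HuggingFace Transformers. It is an illustrative approximation under stated assumptions, not the released implementation: `segment_into_edus` is a hypothetical stand-in for the pre-trained discourse segmenter, and the entailment and decision heads are reduced to single linear layers.

```python
import torch
import torch.nn as nn
from transformers import RobertaModel, RobertaTokenizer

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
encoder = RobertaModel.from_pretrained("roberta-base")

# Three entailment states per EDU (entailed / contradicted / not mentioned)
# and four dialog decisions (yes / no / irrelevant / inquire).
entail_head = nn.Linear(encoder.config.hidden_size, 3)
decision_head = nn.Linear(encoder.config.hidden_size, 4)

def segment_into_edus(rule_text):
    # Hypothetical stand-in for the pre-trained discourse segmentation
    # model; naively splits on commas for illustration only.
    return [clause.strip() for clause in rule_text.split(",") if clause.strip()]

@torch.no_grad()
def predict(rule_text, dialog_history):
    edus = segment_into_edus(rule_text)
    # Prefix each EDU with the tokenizer's <s> marker so every EDU has an
    # anchor token whose hidden state can summarize it.
    marked_doc = " ".join(f"{tokenizer.cls_token} {edu}" for edu in edus)
    inputs = tokenizer(marked_doc, dialog_history,
                       return_tensors="pt", truncation=True)
    hidden = encoder(**inputs).last_hidden_state[0]  # (seq_len, hidden_size)

    # Locate the per-EDU anchor tokens; drop the sequence-level <s> that
    # the tokenizer prepends automatically at position 0.
    cls_positions = (inputs["input_ids"][0] == tokenizer.cls_token_id).nonzero(as_tuple=True)[0]
    edu_states = hidden[cls_positions[1:]]           # (num_edus, hidden_size)

    entail_logits = entail_head(edu_states)          # per-EDU entailment scores
    decision_logits = decision_head(edu_states.mean(dim=0))
    return entail_logits, decision_logits
```

In training, the entailment head would be supervised only weakly, with per-EDU labels derived heuristically from the user's answers to follow-up questions rather than from gold annotations; an "inquire" decision would then trigger the separate follow-up question generator.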
