Learning Health-Bots from Training Data that was Automatically Created using Paraphrase Detection and Expert Knowledge

A key bottleneck in developing dialog models is the lack of adequate training data. Because of privacy issues, dialog data is even scarcer in the health domain. We propose a novel method for creating dialog corpora and apply it to create doctor-patient interaction data. We use this data to train both a generation model and a hybrid classification/retrieval model, and find that the generation model consistently outperforms the hybrid model. We show that our data creation method has several advantages: not only does it allow for the semi-automatic creation of large quantities of training data, it also provides a natural way of guiding learning and a novel method for assessing the quality of human-machine interactions.
