Extracting relevant information from physician-patient dialogues for automated clinical note taking

We present a system for automatically extracting pertinent medical information from dialogues between clinicians and patients. The system parses each dialogue and extracts entities such as medications and symptoms, using context to predict which entities are relevant. It also classifies the primary diagnosis for each conversation, extracts topic information, and identifies relevant utterances. Together, these components serve as a baseline for a system that extracts information from dialogues and automatically generates a patient note for the clinician to review and edit.
