Adapting and evaluating a deep learning language model for clinical why-question answering

Abstract Objectives To adapt and evaluate a deep learning language model for answering why-questions based on patient-specific clinical text. Materials and Methods Bidirectional encoder representations from transformers (BERT) models were trained with varying data sources to perform SQuAD 2.0 style why-question answering (why-QA) on clinical notes. The evaluation focused on: (1) comparing the merits from different training data and (2) error analysis. Results The best model achieved an accuracy of 0.707 (or 0.760 by partial match). Training toward customization for the clinical language helped increase 6% in accuracy. Discussion The error analysis suggested that the model did not really perform deep reasoning and that clinical why-QA might warrant more sophisticated solutions. Conclusion The BERT model achieved moderate accuracy in clinical why-QA and should benefit from the rapidly evolving technology. Despite the identified limitations, it could serve as a competent proxy for question-driven clinical information extraction.

[1]  Jian Peng,et al.  emrQA: A Large Corpus for Question Answering on Electronic Medical Records , 2018, EMNLP.

[2]  Hong Yu,et al.  AskHERMES: An online question answering system for complex clinical questions , 2011, J. Biomed. Informatics.

[3]  George Hripcsak,et al.  Technical Brief: Agreement, the F-Measure, and Reliability in Information Retrieval , 2005, J. Am. Medical Informatics Assoc..

[4]  Omer Levy,et al.  RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.

[5]  Jungwei Fan,et al.  Annotating and Characterizing Clinical Sentences with Explicit Why-QA Cues , 2019, Proceedings of the 2nd Clinical Natural Language Processing Workshop.

[6]  Jian Zhang,et al.  SQuAD: 100,000+ Questions for Machine Comprehension of Text , 2016, EMNLP.

[7]  Peter Szolovits,et al.  MIMIC-III, a freely accessible critical care database , 2016, Scientific Data.

[8]  Percy Liang,et al.  Know What You Don’t Know: Unanswerable Questions for SQuAD , 2018, ACL.

[9]  Pierre Zweigenbaum,et al.  MEANS: A medical question-answering system combining NLP techniques and semantic Web technologies , 2015, Inf. Process. Manag..

[10]  M. Ebell,et al.  Analysis of questions asked by family doctors regarding patient care , 1999, BMJ.

[11]  Wei-Hung Weng,et al.  Publicly Available Clinical BERT Embeddings , 2019, Proceedings of the 2nd Clinical Natural Language Processing Workshop.

[12]  Rodney D. Nielsen,et al.  The MiPACQ clinical question answering system. , 2011, AMIA ... Annual Symposium proceedings. AMIA Symposium.

[13]  Ming-Wei Chang,et al.  BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding , 2019, NAACL.