ClinicalBERT: Modeling Clinical Notes and Predicting Hospital Readmission

Clinical notes contain information about patients that goes beyond structured data like lab values and medications. However, clinical notes have been underused relative to structured data, because notes are high-dimensional and sparse. This work develops and evaluates representations of clinical notes using bidirectional transformers (ClinicalBERT). ClinicalBERT uncovers high-quality relationships between medical concepts as judged by humans. ClinicalBERT outperforms baselines on 30-day hospital readmission prediction using both discharge summaries and the first few days of notes in the intensive care unit. Code and model parameters are available.
