Doctor AI: Predicting Clinical Events via Recurrent Neural Networks

Leveraging large historical data in electronic health records (EHRs), we developed Doctor AI, a generic predictive model that covers observed medical conditions and medication uses. Doctor AI is a temporal model based on recurrent neural networks (RNNs), developed on and applied to longitudinal, time-stamped EHR data from 260K patients over 8 years. Encounter records (e.g., diagnosis codes, medication codes, or procedure codes) were fed to the RNN to predict all diagnosis and medication categories for a subsequent visit. Doctor AI assesses each patient's history to make multilabel predictions (one label for each diagnosis or medication category). In an evaluation on a separate blind test set, Doctor AI performs differential diagnosis with up to 79% recall@30, significantly higher than several baselines. Moreover, we demonstrate the generalizability of Doctor AI by adapting the resulting models from one institution to another without substantial loss of accuracy.
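The recall@30 figure quoted above can be made concrete with a small sketch. The abstract does not spell out the formula, so this assumes one common definition of recall@k for multilabel prediction: for each visit, the fraction of true labels that appear among the k highest-scoring predicted categories, averaged over visits that have at least one true label. The function name `recall_at_k` and the toy scores and labels are hypothetical and purely illustrative.

```python
def recall_at_k(scores, labels, k=30):
    """Mean per-visit recall@k (assumed definition, not taken from the paper).

    scores: list of per-visit lists of predicted scores, one per category.
    labels: list of per-visit lists of 0/1 ground-truth indicators.
    """
    per_visit = []
    for visit_scores, visit_labels in zip(scores, labels):
        n_true = sum(visit_labels)
        if n_true == 0:
            continue  # recall is undefined when a visit has no true labels
        # Rank category indices from highest to lowest predicted score.
        ranked = sorted(range(len(visit_scores)),
                        key=lambda i: visit_scores[i], reverse=True)
        # Count true labels that fall inside the top-k predictions.
        hits = sum(visit_labels[i] for i in ranked[:k])
        per_visit.append(hits / n_true)
    return sum(per_visit) / len(per_visit) if per_visit else float("nan")


# Toy data: 2 visits, 4 categories (made up for illustration).
scores = [[0.9, 0.1, 0.8, 0.3],
          [0.2, 0.7, 0.1, 0.6]]
labels = [[1, 0, 0, 1],
          [0, 1, 0, 1]]
print(recall_at_k(scores, labels, k=2))
```

With k=2, the first visit recovers one of its two true labels and the second recovers both, so the mean is 0.75. Raising k can only keep or increase this value, which is why the choice of k should always be reported alongside the score.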
