Machine Learning for Healthcare: On the Verge of a Major Shift in Healthcare Epidemiology

The increasing availability of electronic health data presents a major opportunity for both discovery and practical applications that improve healthcare. However, for healthcare epidemiologists to make the best use of these data, computational techniques that can handle large, complex datasets are required. Machine learning (ML), the study of tools and methods for identifying patterns in data, can help. Applied appropriately to these data, ML promises to transform patient risk stratification across medicine, and especially in infectious diseases. This, in turn, could lead to targeted interventions that reduce the spread of healthcare-associated pathogens. In this review, we begin with an introduction to the basics of ML. We then discuss how ML can transform healthcare epidemiology, providing examples of successful applications. Finally, we present special considerations for healthcare epidemiologists who want to apply ML.
