A Path for Translation of Machine Learning Products into Healthcare Delivery

Despite enormous enthusiasm, machine learning models are rarely translated into clinical care, and there is minimal evidence of clinical or economic impact. New conference venues and academic journals have emerged to promote the proliferating research; however, the translational path remains unclear. This review presents the first in-depth study to identify how machine learning models that ingest structured electronic health record data can be applied to clinical decision support tasks and translated into clinical practice. The authors complement their own work with the experience of 21 machine learning products that address problems across clinical domains and geographic populations. Four phases of translation emerge: design and develop, evaluate and validate, diffuse and scale, and continue to monitor and maintain. The review highlights the varying approaches that teams building machine learning products take in each phase and discusses the associated challenges and opportunities. The translational path and associated findings are instructive to researchers and developers building machine learning products, policy makers regulating them, and health system leaders considering adopting one.
