The myth of generalisability in clinical research and machine learning in health care

[1]  Robert D Truog, et al. The Toughest Triage - Allocating Ventilators in a Pandemic, 2020, The New England journal of medicine.

[2]  Aldo A. Faisal, et al. The Artificial Intelligence Clinician learns optimal treatment strategies for sepsis in intensive care, 2018, Nature Medicine.

[3]  Anis Sharafoddini, et al. A New Insight Into Missing Data in Intensive Care Unit Patient Profiles: Observational Study, 2018, JMIR medical informatics.

[4]  Rajiv Raman, et al. Performance of a Deep-Learning Algorithm vs Manual Grading for Detecting Diabetic Retinopathy in India, 2019, JAMA ophthalmology.

[5]  Richard Beasley, et al. External validity of randomised controlled trials in asthma: to whom do the results of the trials apply?, 2006, Thorax.

[6]  G. Collins, et al. Early warning scores for detecting deterioration in adult hospital patients: systematic review and critical appraisal of methodology, 2020, BMJ.

[7]  J. Hampton. Evidence-Based Medicine, Opinion-Based Medicine, and Real-World Medicine, 2002, Perspectives in biology and medicine.

[8]  L. Gluud. Bias in clinical intervention research, 2006, American journal of epidemiology.

[9]  Douglas G Altman, et al. Systematic reviews in health care: Assessing the quality of controlled clinical trials, 2001, BMJ.

[10]  Jenna Wiens, et al. A study in transfer learning: leveraging data from multiple hospitals to enhance hospital-specific predictions, 2014, J. Am. Medical Informatics Assoc.

[11]  Constantin F. Aliferis, et al. An evaluation of machine-learning methods for predicting pneumonia mortality, 1997, Artif. Intell. Medicine.

[12]  R. Randell, et al. Strengths and limitations of early warning scores: A systematic review and narrative synthesis, 2017, International journal of nursing studies.

[13]  Richard D Riley, et al. External validation of clinical prediction models using big datasets from e-health records or IPD meta-analysis: opportunities and challenges, 2016, BMJ.

[14]  Wei Luo, et al. Guidelines for Developing and Reporting Machine Learning Predictive Models in Biomedical Research: A Multidisciplinary View, 2016, Journal of medical Internet research.

[15]  Stefan Wermter, et al. Continual Lifelong Learning with Neural Networks: A Review, 2019, Neural Networks.

[16]  R. Califf, et al. Real-World Evidence - What Is It and What Can It Tell Us?, 2016, The New England journal of medicine.

[17]  J. Wedzicha, et al. Development and Reporting of Prediction Models: Guidance for Authors From Editors of Respiratory, Sleep, and Critical Care Journals, 2020, Critical care medicine.

[18]  D. Timmerman, et al. Untapped potential of multicenter studies: a review of cardiovascular risk prediction models revealed inappropriate analyses and wide variation in reporting, 2019, Diagnostic and Prognostic Research.

[20]  L. Celi, et al. “Yes, but will it work for my patients?” Driving clinically relevant research with benchmark datasets, 2020, npj Digital Medicine.

[21]  Jesse A. Berlin, et al. Assessing the Generalizability of Prognostic Information, 1999.

[22]  Kathryn J Fowler, et al. Assessing Radiology Research on Artificial Intelligence: A Brief Guide for Authors, Reviewers, and Readers-From the Radiology Editorial Board, 2019, Radiology.

[23]  P. Rothwell, et al. External validity of randomised controlled trials: “To whom do the results of this trial apply?”, 2005, The Lancet.

[24]  P. Rothwell, et al. Factors That Can Affect the External Validity of Randomised Controlled Trials, 2006, PLoS clinical trials.

[25]  J. Ioannidis, et al. Artificial intelligence versus clinicians: systematic review of design, reporting standards, and claims of deep learning studies, 2020, BMJ.

[26]  D G Altman, et al. What do we mean by validating a prognostic model?, 2000, Statistics in medicine.

[27]  Rebecca C. Steorts, et al. Minimal Impact of Implemented Early Warning Score and Best Practice Alert for Patient Deterioration, 2019, Critical care medicine.

[28]  Aaron Y. Lee, et al. Clinical applications of continual learning machine learning, 2020, The Lancet. Digital health.

[29]  James T. Kwok, et al. Generalizing from a Few Examples, 2019, ACM Comput. Surv.

[30]  I. Kohane, et al. Biases in electronic health record data due to processes within the healthcare system: retrospective observational study, 2018, British Medical Journal.

[31]  John Adams Assessing , 2020, Transport Planning.

[32]  Anna Goldenberg, et al. Feature Robustness in Non-stationary Health Records: Caveats to Deployable Model Performance in Common Clinical Machine Learning Tasks, 2019, MLHC.

[33]  Guanhua Chen, et al. Calibration drift in regression and machine learning models for acute kidney injury, 2017, J. Am. Medical Informatics Assoc.

[34]  Richard D Riley, et al. Prediction models for diagnosis and prognosis of covid-19 infection: systematic review and critical appraisal, 2020.

[35]  Marcus A. Badgeley, et al. Confounding variables can degrade generalization performance of radiological deep learning models, 2018, ArXiv.