Differences in cohort study data affect external validation of artificial intelligence models for predictive diagnostics of dementia - lessons for translation into clinical practice

Artificial intelligence (AI) approaches offer a great opportunity for individualized, pre-symptomatic disease diagnosis, which plays a key role in personalized, predictive, and ultimately preventive medicine (PPPM). However, to translate PPPM into clinical practice, AI-based models must be carefully validated. The validation process comprises several steps, one of which is testing the model on patient-level data from an independent clinical cohort study. Recruitment criteria, however, can bias the statistical analysis of cohort study data and impede model application beyond the training data. To evaluate whether and how data from independent clinical cohort studies differ from each other, this study systematically compares the datasets collected from two major dementia cohorts: the Alzheimer’s Disease Neuroimaging Initiative (ADNI) and AddNeuroMed. The comparison was conducted at the level of individual features and revealed significant differences between the two cohorts. Such systematic deviations can hamper the generalizability of results based on a single cohort dataset. Despite the identified differences, validation of a previously published, ADNI-trained model for predicting personalized dementia risk scores on 244 AddNeuroMed subjects was successful: external validation yielded a high prediction performance, with an area under the receiver operating characteristic curve above 80% up to 6 years before dementia diagnosis. Propensity score matching identified a subset of AddNeuroMed patients with significantly smaller demographic differences to ADNI. For these patients, an even higher prediction performance was achieved, which demonstrates the influence that systematic differences between cohorts can have on validation results.
In conclusion, this study exposes challenges in the external validation of AI models on cohort study data and is one of the rare cases in neurology in which such external validation was performed. The presented model represents a proof of concept that reliable models for personalized predictive diagnostics are feasible, which, in turn, could lead to adequate disease prevention and thereby enable the PPPM paradigm in the dementia field.
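The two quantitative tools named in the abstract, the area under the receiver operating characteristic curve and propensity score matching, can be illustrated with a minimal Python sketch. It is not the authors' implementation. The helper names (`auc`, `propensity_scores`, `greedy_match`), the single covariate (age), the caliper value, and the toy numbers are all assumptions made for illustration, and the propensity model is a plain gradient-descent logistic regression rather than any particular library routine.

```python
import math


def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def _standardise(rows):
    """Scale every covariate column to zero mean and unit variance."""
    cols = list(zip(*rows))
    means = [sum(c) / len(c) for c in cols]
    sds = [math.sqrt(sum((v - m) ** 2 for v in c) / len(c)) or 1.0
           for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, sds)] for r in rows]


def _sigmoid_dot(w, x):
    """Logistic function of bias w[0] plus the weighted covariates."""
    z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], x))
    return 1.0 / (1.0 + math.exp(-z))


def propensity_scores(reference, external, lr=0.5, epochs=500):
    """Estimate P(subject belongs to the reference cohort | covariates).

    Fits a logistic regression by batch gradient descent on the pooled,
    standardised covariates and returns (scores_reference, scores_external).
    """
    X = _standardise(reference + external)
    y = [1.0] * len(reference) + [0.0] * len(external)
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        errors = [_sigmoid_dot(w, x) - yi for x, yi in zip(X, y)]
        grad = [sum(errors)] + [sum(e * x[j] for e, x in zip(errors, X))
                                for j in range(len(X[0]))]
        w = [wj - lr * g / len(X) for wj, g in zip(w, grad)]
    ps = [_sigmoid_dot(w, x) for x in X]
    return ps[:len(reference)], ps[len(reference):]


def greedy_match(ps_reference, ps_external, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    Returns (reference_index, external_index) pairs whose propensity
    scores differ by at most `caliper`.
    """
    free = set(range(len(ps_external)))
    pairs = []
    for i, p in enumerate(ps_reference):
        if not free:
            break
        j = min(free, key=lambda k: (abs(ps_external[k] - p), k))
        if abs(ps_external[j] - p) <= caliper:
            pairs.append((i, j))
            free.remove(j)
    return pairs


# Toy usage with made-up ages (hypothetical values, not study data).
adni_like = [[65.0], [68.0], [70.0], [72.0], [74.0]]
addneuromed_like = [[72.0], [75.0], [78.0], [80.0], [83.0]]
ps_ref, ps_ext = propensity_scores(adni_like, addneuromed_like)
pairs = greedy_match(ps_ref, ps_ext, caliper=0.1)
matched_external = sorted(j for _, j in pairs)  # external subjects kept after matching
```

In a workflow like the one the abstract describes, `auc` would be computed once on all external subjects and once on the matched subset only, so that the effect of cohort differences on validation performance becomes visible. Greedy matching with a caliper is only one of several matching strategies; the choice here is for brevity.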
