Influence of medical domain knowledge on deep learning for Alzheimer's disease prediction

BACKGROUND AND OBJECTIVE: Alzheimer's disease (AD) is the most common type of dementia and can seriously affect a person's ability to perform daily activities. Estimates indicate that AD may rank third as a cause of death for older people, after heart disease and cancer. Identifying individuals at risk of developing AD is imperative for testing therapeutic interventions. The objective of the study was to determine whether diagnosis of AD from electronic medical record (EMR) data alone (without relying on diagnostic imaging) could be significantly improved by applying clinical domain knowledge in data preprocessing and positive dataset selection rather than by setting naïve filters.
METHODS: Data were extracted from a repository of heterogeneous ambulatory EMR data collected from primary care medical offices across the U.S. Medical domain knowledge was applied to build a positive dataset from data relevant to AD. Selected Clinically Relevant Positive (SCRP) datasets were used as inputs to a Long Short-Term Memory (LSTM) Recurrent Neural Network (RNN) deep learning model to predict whether a patient will develop AD.
RESULTS: Risk score prediction of AD using drug-domain information in an SCRP AD dataset of 2,324 patients achieved a high out-of-sample score of 0.98-0.99 Area Under the Precision-Recall Curve (AUPRC) when 90% of the SCRP dataset was used for training. AUPRC dropped to 0.89 when the model was trained on fewer than 1,500 cases from the SCRP dataset. Even then, the model still performed significantly better than one trained with naïve dataset selection.
CONCLUSION: The LSTM RNN method that used data relevant to AD performed significantly better when learning from the SCRP dataset than when datasets were selected naïvely. Integrating qualitative medical knowledge for dataset selection with deep learning technology provided a mechanism for significantly improving AD prediction.
Accurate and early prediction of AD is important for identifying patients for clinical trials, which may lead to the discovery of new drugs for treating AD. The proposed predictions also support better selection of patients who need diagnostic imaging to differentiate AD from other degenerative brain disorders.
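
The results above are reported as AUPRC. As a minimal sketch of how such a score can be computed from held-out labels and predicted risk scores, the Python function below uses the step-wise (average precision) estimate of the area under the precision-recall curve. The function name, the toy labels, and the toy scores are illustrative assumptions and are not taken from the study; the study's own evaluation code and its handling of tied scores are not described in the abstract.

```python
from typing import Sequence


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Step-wise estimate of the area under the precision-recall curve.

    y_true holds 1 for patients who developed the condition and 0 otherwise;
    scores holds the model's predicted risk for the same patients.
    Assumption: scores contain no ties (ties are not grouped here).
    """
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("need at least one positive label")

    # Rank patients from highest to lowest predicted risk.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    true_positives = 0
    area = 0.0
    for rank, i in enumerate(order, start=1):
        if y_true[i] == 1:
            true_positives += 1
            # Each positive raises recall by 1/n_pos; weight that step
            # by the precision observed at this rank.
            area += (true_positives / rank) / n_pos
    return area


# Hypothetical toy data, for illustration only.
toy_labels = [1, 0, 1, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
toy_auprc = average_precision(toy_labels, toy_scores)
```

Unlike the area under the ROC curve, this score depends on the share of positive cases in the evaluation set, so AUPRC values are best compared across models evaluated on the same held-out data.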
