Personalized long-term prediction of cognitive function: Using sequential assessments to improve model performance

Predicting the onset and progression of cognitive decline and dementia is important both for understanding the underlying disease processes and for planning health care for populations at risk. Predictors identified in research studies are typically assessed at a single point in time. In this manuscript, we argue that an accurate model for predicting cognitive status over relatively long periods requires time-varying components that are assessed sequentially at multiple time points (e.g., at multiple follow-up visits). We developed two pilot models to test the feasibility of using either estimated or observed risk factors to predict cognitive status. The first model uses sequential estimates of risk factors, starting from values obtained 8 years earlier and then improved by optimization. This model can predict how cognition will change over relatively long time periods. The second model uses observed rather than estimated time-varying risk factors and, as expected, yields better prediction. This model can make predictions when newly observed data are acquired at a follow-up visit. Evaluation of both models by 10-fold cross-validation and in various patient subgroups provides supporting evidence for these pilot models. Each model consists of multiple base prediction units (BPUs), which were trained on the same set of data. The difference in usage and function between the two models lies in the source of input data: either estimated or observed. In the next step of model refinement, we plan to integrate the two types of data to flexibly predict dementia status and its changes over time when some time-varying predictors are measured only once and others are measured repeatedly. Computationally, the two types of data provide upper and lower bounds on predictive performance.
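The distinction between estimated and observed inputs can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the form of a BPU or the optimization step, so each hypothetical BPU here is a plain least-squares map from one visit's risk factors to the next visit's, the data are synthetic, and all function names are assumptions made for illustration. The final step from risk factors to cognitive status is omitted.

```python
import numpy as np

def fit_linear(X, Y):
    """Least-squares fit of Y ~ [X, 1]; returns the coefficient matrix."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    coef, *_ = np.linalg.lstsq(A, Y, rcond=None)
    return coef

def apply_linear(coef, X):
    """Apply a fitted linear map (with intercept) to X."""
    A = np.hstack([X, np.ones((X.shape[0], 1))])
    return A @ coef

def train_bpus(visits):
    """One hypothetical BPU per consecutive pair of visits: visit t -> visit t+1."""
    return [fit_linear(visits[t], visits[t + 1]) for t in range(len(visits) - 1)]

def rollout(bpus, baseline):
    """Estimated mode: chain the BPUs so each step consumes the previous estimate."""
    est = baseline
    for coef in bpus:
        est = apply_linear(coef, est)
    return est

def make_synthetic_visits(n=200, d=3, n_visits=4, seed=0):
    """Synthetic risk factors that drift slowly between visits (assumption)."""
    rng = np.random.default_rng(seed)
    visits = [rng.normal(size=(n, d))]
    for _ in range(n_visits - 1):
        visits.append(0.9 * visits[-1] + 0.1 + 0.2 * rng.normal(size=(n, d)))
    return visits

visits = make_synthetic_visits()
train = [v[:150] for v in visits]   # subjects used to fit the BPUs
test = [v[150:] for v in visits]    # held-out subjects
bpus = train_bpus(train)

# Estimated inputs: only the baseline visit of the held-out subjects is used.
estimated_last = rollout(bpus, test[0])
# Observed inputs: the most recent observed visit feeds the last BPU directly.
observed_last = apply_linear(bpus[-1], test[-2])

err_estimated = float(np.mean((estimated_last - test[-1]) ** 2))
err_observed = float(np.mean((observed_last - test[-1]) ** 2))
print(f"MSE from baseline-only rollout: {err_estimated:.3f}")
print(f"MSE from latest observed visit: {err_observed:.3f}")
```

In this toy setting the rollout accumulates noise from every intermediate step, so the prediction driven by the latest observed visit has the lower error, which mirrors the ordering of the two models described above.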
