Selecting the most important self-assessed features for predicting conversion to mild cognitive impairment with random forest and permutation-based methods

Alzheimer’s Disease (AD) is a complex, multifactorial and comorbid condition. The absence of symptoms in the early stages makes identifying the onset of the disease particularly challenging. Mild cognitive impairment (MCI) is an intermediate stage between the expected decline of normal aging and the pathological decline associated with dementia. Identifying risk factors for MCI is therefore a pressing need. Self-reported personal information, such as age, education, income level, sleep, diet and physical exercise, is expected to play a key role not only in the early identification of MCI but also in the design of personalized interventions and the promotion of patient empowerment. In this study we leverage The Vallecas Project, a large longitudinal study on healthy aging in Spain, to identify the most important self-reported features for future conversion to MCI. Using machine learning (random forest) and permutation-based methods, we select the set of most important self-reported variables for MCI conversion, which includes, among others, subjective cognitive decline, educational level, work experience, social life, and diet. Subjective cognitive decline stands out as the most important feature for future conversion to MCI across the different feature selection techniques.
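
As an illustration of the general approach named above (a random forest scored with permutation-based feature importance), the following is a minimal sketch using scikit-learn. It is not the study's pipeline: the data are synthetic, and the feature names (`scd`, `education_years`, `sleep_hours`, `noise`), the effect sizes, and all hyperparameters are assumptions chosen only to make the example runnable.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic stand-in for self-reported data (hypothetical features, not Vallecas data).
rng = np.random.default_rng(0)
n = 600
feature_names = ["scd", "education_years", "sleep_hours", "noise"]
X = np.column_stack([
    rng.integers(0, 2, n),    # subjective cognitive decline (0 = no, 1 = yes)
    rng.integers(6, 20, n),   # years of education
    rng.normal(7, 1, n),      # hours of sleep
    rng.normal(0, 1, n),      # pure noise feature
])

# Assumed data-generating process: conversion risk driven mainly by scd.
logit = -2.0 + 3.0 * X[:, 0] - 0.05 * (X[:, 1] - 12)
y = rng.binomial(1, 1.0 / (1.0 + np.exp(-logit)))

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Fit the random forest on the training split only.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# Permutation importance: shuffle one feature at a time on held-out data
# and measure the resulting drop in ROC AUC.
result = permutation_importance(
    model, X_test, y_test, n_repeats=20, random_state=0, scoring="roc_auc"
)

# Rank features from most to least important.
ranking = sorted(
    zip(feature_names, result.importances_mean),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, score in ranking:
    print(f"{name:>16s}  mean AUC drop = {score:.3f}")
```

Computing the importance on held-out data rather than on the training split is a deliberate choice in this sketch: it measures how much the model's generalization depends on each feature, and it avoids the bias toward high-cardinality features that impurity-based random forest importances are known to show.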
