Categorize, Cluster & Classify - The 3C Strategy Applied to Alzheimer's Disease as a Case Study

Health informatics is increasingly called on to analyse current medical data, especially hospital data, in order to understand disease mechanisms, predict the course of a disease, or assist in targeting potential therapeutic options. Alongside these promises, many challenges emerge. Among the major ones we identify: current diagnostic criteria that are too vague to capture disease manifestation; the irrelevance of personalized medicine when only heterogeneous classes of patients are available; and the question of how to process big data properly to avoid false claims. We offer a 3C strategy that starts from medical knowledge by categorizing the available set of features into three types: the patients' assigned disease diagnosis, clinical measurements, and potential biological markers. It then proceeds to an unsupervised learning process aimed at creating new disease diagnosis classes, and finally classifies the newly proposed diagnosis classes using the potential biological markers. To allow the evaluation and comparison of different algorithmic components of the 3C strategy, a simulation model was built and put to use. Our strategy, developed as part of the medical informatics work package of the EU Human Brain flagship Project, strives to connect potential biomarkers with more homogeneous classes of disease manifestation that are expressed by meaningful features. We demonstrate this strategy using data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) cohort.
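The three steps of the strategy (categorize, cluster, classify) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the cohort is synthetic rather than ADNI data, the feature names (`memory_score`, `csf_marker`, and so on) are hypothetical, and plain k-means and a nearest-centroid classifier stand in for whichever clustering and classification components one would actually evaluate. It is not the authors' implementation.

```python
import random

# Synthetic cohort (hypothetical values, NOT ADNI data). A hidden subtype
# drives both clinical scores and biomarkers; the assigned diagnosis is
# deliberately uninformative to mimic a heterogeneous diagnostic class.
rng = random.Random(0)
patients = []
for i in range(60):
    latent = i % 2
    patients.append({
        "diagnosis": rng.choice(["MCI", "AD"]),
        "memory_score": rng.gauss(10 + 10 * latent, 1.0),
        "function_score": rng.gauss(10 + 10 * latent, 1.0),
        "csf_marker": rng.gauss(1.0 + 2.0 * latent, 0.3),
        "imaging_marker": rng.gauss(5.0 - 3.0 * latent, 0.3),
    })

# Step 1 (Categorize): assign every feature to one of the three types.
CATEGORIES = {
    "diagnosis": ["diagnosis"],
    "clinical": ["memory_score", "function_score"],
    "biomarker": ["csf_marker", "imaging_marker"],
}

def columns(rows, names):
    """Extract the named features of each row as a tuple."""
    return [tuple(r[n] for n in names) for r in rows]

def dist2(a, b):
    """Squared Euclidean distance."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def centroid(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(col) for col in zip(*points))

def nearest(p, centroids):
    """Index of the centroid closest to point p."""
    return min(range(len(centroids)), key=lambda c: dist2(p, centroids[c]))

def kmeans(points, k, iters=25):
    """Plain k-means with deterministic farthest-first initialisation."""
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(
            max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    for _ in range(iters):
        labels = [nearest(p, centroids) for p in points]
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centroids[c] = centroid(members)
    return [nearest(p, centroids) for p in points]

# Step 2 (Cluster): derive new diagnosis classes from clinical measurements only.
clinical = columns(patients, CATEGORIES["clinical"])
new_classes = kmeans(clinical, k=2)

# Step 3 (Classify): predict the new classes from biomarkers only,
# fitting on the first 40 patients and testing on the remaining 20.
biomarkers = columns(patients, CATEGORIES["biomarker"])
train_idx, test_idx = range(0, 40), range(40, 60)
class_centroids = {
    c: centroid([biomarkers[i] for i in train_idx if new_classes[i] == c])
    for c in set(new_classes)
}

def classify(b):
    """Nearest-centroid prediction of the new class from a biomarker vector."""
    return min(class_centroids, key=lambda c: dist2(b, class_centroids[c]))

accuracy = sum(classify(biomarkers[i]) == new_classes[i]
               for i in test_idx) / len(test_idx)
print(f"held-out accuracy of biomarkers on new classes: {accuracy:.2f}")
```

Because the toy cohort is generated by a known hidden subtype, it plays a role loosely similar to the simulation model mentioned above: swapping `kmeans` or `classify` for another component and rerunning gives a like-for-like comparison on data whose true structure is known.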
