Alzheimer's disease diagnosis framework from incomplete multimodal data using convolutional neural networks

Alzheimer's disease (AD) is a severe, irreversible neurodegenerative disease that causes great suffering to patients and eventually leads to death. Early detection of AD and its prodromal stage, mild cognitive impairment (MCI), which can be either stable (sMCI) or progressive (pMCI), is highly desirable for effective treatment planning and tailored therapy. Recent studies recommend fusing multimodal genetic data (single nucleotide polymorphisms, SNPs) and neuroimaging data (magnetic resonance imaging (MRI) and positron emission tomography (PET)) to discriminate AD/MCI from normal control (NC) subjects. However, missing multimodal data in the cohort under study is inevitable. In addition, the heterogeneity between phenotype and genotype biomarkers makes it harder for models to learn. Current studies also focus mainly on brain disease classification and ignore the regression task, and they rely on multistage approaches to predict brain disease progression. To address these issues, we propose a novel framework that fuses multimodal neuroimaging and genetic data for joint classification and clinical score regression, using the maximum number of available samples in one unified convolutional neural network (CNN) framework. Specifically, we first apply a technique based on linear interpolation to fill in the missing features of each incomplete sample. Then, we learn features from MRI, PET, and SNPs using a CNN to alleviate the heterogeneity between genotype and phenotype data. The high-level learned features from each modality are then combined to jointly identify brain diseases and predict clinical scores. To validate the proposed method, we evaluate it on 805 subjects from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset. We also verify the similarity between the synthetic and real data using statistical analysis.
Moreover, the experimental results demonstrate that the proposed method yields better performance in both classification and regression tasks. Specifically, it achieves accuracies of 98.22%, 93.11%, and 97.35% for NC vs. AD, NC vs. sMCI, and NC vs. pMCI, respectively. For the clinical score regression tasks, it attains the lowest root mean square error and the highest correlation coefficient compared with state-of-the-art methods.
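As an illustration of the two regression metrics named above, the following minimal sketch computes the root mean square error and a correlation coefficient between true and predicted clinical scores. The abstract does not state which correlation coefficient is used, so the Pearson coefficient is assumed here. The function names `rmse` and `pearson_cc` and the score values are hypothetical and are not results from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean square error between two equal-length score sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson_cc(y_true, y_pred):
    """Pearson correlation coefficient (assumed; the abstract only says
    'correlation coefficient')."""
    if len(y_true) != len(y_pred) or len(y_true) < 2:
        raise ValueError("inputs must have equal length of at least 2")
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    if var_t == 0 or var_p == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_t * var_p)


# Hypothetical clinical scores, for illustration only (not study data).
true_scores = [28.0, 25.0, 21.0, 30.0]
pred_scores = [27.0, 26.0, 22.0, 29.0]

print("RMSE:", rmse(true_scores, pred_scores))
print("CC:", pearson_cc(true_scores, pred_scores))
```

A lower RMSE and a correlation coefficient closer to 1 both indicate that the predicted scores track the true scores more closely, which is the sense in which the two metrics are reported together.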