ℓ2,1-ℓ1 regularized nonlinear multi-task representation learning based cognitive performance prediction of Alzheimer's disease

Abstract Alzheimer’s disease (AD) is not only a substantial financial burden on the health care system but also a source of emotional hardship for patients and their families. Predicting the cognitive performance of subjects from their magnetic resonance imaging (MRI) measures and identifying relevant imaging biomarkers are important research topics in the study of Alzheimer’s disease. Many previous works formulate the prediction task as a linear regression problem. Their most critical limitation is the assumption of a linear relationship between the MRI features and the cognitive outcomes: linear models in the original MRI feature space cannot exploit nonlinear relations between the MRI features and the cognitive measures being predicted. To better capture the complex relationship between the cognitive scores and the neuroimaging measures, we propose an ℓ2,1-ℓ1 norm regularized multi-kernel multi-task feature learning formulation with joint sparsity-inducing regularization. The formulation uses shared kernel functions together with the high-dimensional features in the kernel-induced feature spaces to find a common representation that is useful for all tasks, promoting the use of few kernels and of few learned features within each kernel. For optimization, we develop an alternating optimization method to solve the proposed mixed-norm regularized formulation effectively. We evaluate the proposed method on the Alzheimer’s Disease Neuroimaging Initiative (ADNI) datasets and demonstrate that it achieves not only clearly improved prediction performance for cognitive measurements with single-modality MRI or multi-modality data, but also a compact set of highly suggestive biomarkers relevant to AD.
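
As a rough illustration of the two norms named in the formulation, the sketch below computes an ℓ2,1 norm (the sum of the Euclidean norms of the rows) and an ℓ1 norm (the sum of absolute entries) of a small weight matrix. The matrix layout (rows as features, columns as tasks), the function names, and the weights `lam_group` and `lam_elem` are assumptions made for this example; the paper's full objective, which also involves the kernels, is not reproduced here.

```python
import math

def l21_norm(W):
    """Sum over rows of the Euclidean norm of each row.

    Assumed layout: each row is one feature, each column one task, so a
    row with zero norm means that feature is unused by every task.
    """
    return sum(math.sqrt(sum(w * w for w in row)) for row in W)

def l1_norm(W):
    """Sum of the absolute values of all entries."""
    return sum(abs(w) for row in W for w in row)

def mixed_penalty(W, lam_group, lam_elem):
    """Hypothetical combined penalty: lam_group * l2,1 + lam_elem * l1."""
    return lam_group * l21_norm(W) + lam_elem * l1_norm(W)

# Toy weight matrix: 3 features (rows) x 2 tasks (columns).
W = [[3.0, 4.0],   # feature used by both tasks
     [0.0, 0.0],   # feature dropped for all tasks (row-level sparsity)
     [1.0, 0.0]]   # feature used by one task only (entry-level sparsity)

print(l21_norm(W))                 # row norms are 5, 0 and 1
print(l1_norm(W))                  # absolute values are 3, 4 and 1
print(mixed_penalty(W, 1.0, 0.5))
```

The ℓ2,1 term grows with the number of non-zero rows, so penalizing it tends to discard whole features across all tasks, while the ℓ1 term additionally encourages individual zero entries within the rows that remain.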
