Regularized Multi-source Matrix Factorization for Diagnosis of Alzheimer's Disease

In many real-world systems with multiple data sources, data are often missing in a block-wise fashion. For example, in the diagnosis of Alzheimer’s disease, doctors may collect patient data from MRI images, PET images, and CSF tests, yet some patients may have undergone only the MRI and PET scans, while others may have undergone only the MRI scan and the CSF test. Although various data imputation techniques exist, they generally neglect the correlation among the data sources and may therefore lead to sub-optimal performance. In this paper, we propose a model called regularized multi-source matrix factorization (RMSMF) to alleviate this problem. Specifically, to model the correlation among data sources, RMSMF first uses non-negative matrix factorization to factorize the observed multi-source data into the product of subject factors and feature factors. In this process, we assume that different subjects from the same data source share the same feature factors. Furthermore, similarity constraints are imposed on the subject factors by assuming that, for the same subject, the subject factors are similar across all sources. Moreover, self-paced learning with a soft weighting strategy is applied to reduce the negative influence of noisy data and to further enhance the performance of RMSMF. We apply our model to the diagnosis of Alzheimer’s disease. Experimental results on the ADNI data set demonstrate its effectiveness.
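The factorization idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm: it assumes plain projected gradient descent, a squared-difference penalty that pulls each source's subject factors toward their cross-source mean, and it omits the self-paced soft weighting entirely. The function name `fit_multisource_nmf` and the parameters `rank`, `lam`, `lr`, and `iters` are hypothetical.

```python
import numpy as np

def fit_multisource_nmf(X_list, observed, rank=2, lam=0.1, lr=0.01,
                        iters=2000, seed=0):
    """Illustrative multi-source NMF with block-wise missing rows.

    X_list[s]   : (n_subjects, d_s) non-negative data for source s;
                  rows of unobserved subjects may hold NaN.
    observed[s] : (n_subjects,) boolean, True where the subject has source s.
    Returns per-source subject factors U[s] (n_subjects, rank) and
    feature factors V[s] (d_s, rank), all non-negative.
    """
    rng = np.random.default_rng(seed)
    n = X_list[0].shape[0]
    U = [rng.random((n, rank)) for _ in X_list]
    V = [rng.random((X.shape[1], rank)) for X in X_list]

    for _ in range(iters):
        # Assumed similarity term: each U[s] is pulled toward the mean
        # of the subject factors over all sources.
        U_mean = np.mean(U, axis=0)
        for s, (X, m) in enumerate(zip(X_list, observed)):
            # Residual on observed rows only; missing rows contribute zero.
            R = np.where(m[:, None], U[s] @ V[s].T - X, 0.0)
            grad_U = R @ V[s] + lam * (U[s] - U_mean)
            grad_V = R.T @ U[s]
            # Gradient step followed by projection onto the non-negative orthant.
            U[s] = np.maximum(U[s] - lr * grad_U, 0.0)
            V[s] = np.maximum(V[s] - lr * grad_V, 0.0)
    return U, V
```

Under these assumptions, a missing block for source `s` could be imputed as the corresponding rows of `U[s] @ V[s].T`, since the similarity term carries information from the sources a subject does have into the factors of the source that subject lacks.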
