Low-Rank Graph-Regularized Structured Sparse Regression for Identifying Genetic Biomarkers

In this paper, we propose a novel sparse regression method for brain-wide and genome-wide association studies. Specifically, we impose a low-rank constraint on the weight coefficient matrix and decompose it into two low-rank matrices, which capture the relationships among the genetic features and among the brain imaging features, respectively. We also introduce a sparse acyclic digraph with a sparsity-inducing penalty to further account for the correlations among the genetic variables, which makes it possible to identify the representative SNPs that are highly associated with the brain imaging features. We optimize our objective function by jointly tackling low-rank regression and variable selection in a single framework. In our method, the low-rank constraint allows us to conduct variable selection with low-rank representations of the data, and the learned sparse weight coefficients allow us to discard unimportant variables at the end. Experimental results on the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset show that the proposed method selects important SNPs that estimate the brain imaging features more accurately than state-of-the-art methods.
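To make the low-rank decomposition and the variable-selection step concrete, the following is a minimal sketch on synthetic data. It is not the proposed algorithm: it uses classical reduced-rank regression (least squares followed by an SVD projection) as a stand-in for the low-rank constraint, ranks SNPs by the row norms of the fitted weight matrix as a stand-in for the sparsity-inducing penalty, and omits the graph regularizer entirely. All names (`reduced_rank_regression`, `rank_rows_by_l2_norm`, `A`, `B`, `W`), the problem sizes, and the Gaussian stand-in for SNP data are assumptions made for illustration.

```python
import numpy as np


def reduced_rank_regression(X, Y, rank, ridge=1e-6):
    """Classical reduced-rank regression (illustrative stand-in).

    Returns factors A (d x rank) and B (q x rank) so that W = A @ B.T
    has rank at most `rank`.
    """
    d = X.shape[1]
    # Ridge-stabilised full-rank least-squares solution.
    W_ols = np.linalg.solve(X.T @ X + ridge * np.eye(d), X.T @ Y)
    # Top right singular vectors of the fitted responses span the
    # low-dimensional subspace on the imaging side.
    _, _, Vt = np.linalg.svd(X @ W_ols, full_matrices=False)
    B = Vt[:rank].T   # factor over the q imaging features
    A = W_ols @ B     # factor over the d genetic features
    return A, B


def rank_rows_by_l2_norm(W):
    """Order input variables (rows of W) by decreasing l2 row norm."""
    return np.argsort(-np.linalg.norm(W, axis=1))


# Synthetic example (assumed sizes): 200 subjects, 30 SNPs,
# 8 imaging features, true rank 2, only the first 5 SNPs are active.
rng = np.random.default_rng(0)
n, d, q, r, n_active = 200, 30, 8, 2, 5
A_true = np.zeros((d, r))
A_true[:n_active] = rng.uniform(1.0, 2.0, size=(n_active, r))
B_true = np.linalg.qr(rng.standard_normal((q, r)))[0]  # orthonormal columns
X = rng.standard_normal((n, d))  # Gaussian stand-in for SNP features
Y = X @ A_true @ B_true.T + 0.01 * rng.standard_normal((n, q))

A, B = reduced_rank_regression(X, Y, rank=r)
W = A @ B.T                                   # low-rank weight matrix
top_snps = rank_rows_by_l2_norm(W)[:n_active]  # candidate SNP indices
print(sorted(top_snps.tolist()))
```

In this sketch the rows of `W` index SNPs and the columns index imaging features, so a row with a near-zero norm corresponds to a SNP that contributes to none of the imaging features and could be discarded. A method that learns the low-rank factors and the row sparsity jointly, as the abstract describes, would replace the two separate steps used here.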
