Structured sparse CCA for brain imaging genetics via graph OSCAR

Background: Recently, structured sparse canonical correlation analysis (SCCA) has received increased attention in brain imaging genetics studies. It can identify bi-multivariate imaging genetic associations and select relevant features with the desired structure information. Existing SCCA methods either use the fused lasso regularizer to induce smoothness between ordered features, or use the signed pairwise difference, which depends on the estimated sign of the sample correlation. Several other structured SCCA models use the group lasso or graph fused lasso to encourage group structure, but they require structure or group information to be provided in advance, which is sometimes not available.

Results: We propose a new structured SCCA model that employs the graph OSCAR (GOSCAR) regularizer to encourage highly correlated features to have similar or equal canonical weights. Our GOSCAR-based SCCA has two advantages: 1) It does not require the sign of the sample correlation to be pre-defined, and thus could reduce the estimation bias. 2) It can pull highly correlated features together regardless of whether they are positively or negatively correlated. We evaluate our method using both synthetic and real data. Using 191 ROI measurements from amyloid imaging data and 58 genetic markers within the APOE gene, our method identifies a strong association between the APOE SNP rs429358 and the amyloid burden measure in the frontal region. In addition, the estimated canonical weights present a clear pattern, which is preferable for further investigation.

Conclusions: Our proposed method shows better or comparable performance on the synthetic data in terms of the estimated correlations and canonical loadings. It successfully identified an important association between the Alzheimer's disease risk SNP rs429358 and the amyloid burden measure in the frontal region.
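The sign-independence property described in the Results can be illustrated with a small sketch. The abstract does not state the penalty formula, so the code assumes the commonly stated form of the graph OSCAR penalty: an L1 term plus, for each graph edge (i, j), the term max(|u_i|, |u_j|). The function names, the toy edge list, and the parameters `lam1` and `lam2` are hypothetical and are not taken from the paper.

```python
import numpy as np

def goscar_penalty(u, edges, lam1=1.0, lam2=1.0):
    """Assumed GOSCAR form: lam1 * ||u||_1 + lam2 * sum over edges of max(|u_i|, |u_j|)."""
    u = np.asarray(u, dtype=float)
    l1_term = np.abs(u).sum()
    # Pairwise term depends only on magnitudes, so no sign estimate is needed.
    pair_term = sum(max(abs(u[i]), abs(u[j])) for i, j in edges)
    return lam1 * l1_term + lam2 * pair_term

def pairwise_max(a, b):
    """Identity: max(|a|, |b|) = 0.5 * (|a - b| + |a + b|)."""
    return 0.5 * (abs(a - b) + abs(a + b))

# Toy graph over three features (hypothetical): edges 0-1 and 1-2.
edges = [(0, 1), (1, 2)]

# Weights of equal magnitude get the same penalty whether their signs agree or not.
same_sign = goscar_penalty([0.5, 0.5, 0.0], edges)
opposite_sign = goscar_penalty([0.5, -0.5, 0.0], edges)
print(same_sign, opposite_sign)
```

Because the pairwise term uses only absolute values, weights of equal magnitude incur the same penalty whether the two features are positively or negatively correlated, which is consistent with the second advantage listed above. This is an illustration of the regularizer alone under the stated assumption, not the full SCCA optimization.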
