Statistical Applications in Genetics and Molecular Biology

Quantifying the Association between Gene Expressions and DNA-Markers by Penalized Canonical Correlation Analysis

Multiple changes at the DNA level underlie complex diseases. Identifying the genetic networks influenced by these changes may help in understanding how these diseases develop. Canonical correlation analysis can be used to associate gene expressions with DNA-markers, thereby revealing sets of co-expressed and co-regulated genes and their associated DNA-markers. However, when the number of variables is large, as in microarray studies, the results can be difficult to interpret. Adapting the elastic net to canonical correlation analysis reduces the number of variables and makes interpretation easier. Moreover, owing to the grouping effect of the elastic net, co-regulated and co-expressed genes cluster together. Additionally, our adaptation works well in situations where the number of variables far exceeds the number of subjects.
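To make the idea of penalized canonical weights concrete, the following is a minimal sketch and not the algorithm of this paper. It swaps in plain soft-thresholding (a lasso-type penalty) for the elastic net adaptation described above, and alternates between the two weight vectors on the sample cross-correlation matrix. The function names, the penalty values `lam_x` and `lam_y`, and the simulated data (rows as subjects, columns of `X` standing in for gene expressions, columns of `Y` for DNA-markers) are all assumptions made for illustration.

```python
import numpy as np


def standardize(A):
    """Center each column and scale it to unit variance."""
    A = A - A.mean(axis=0)
    sd = A.std(axis=0)
    sd[sd == 0] = 1.0  # constant columns stay at zero instead of dividing by 0
    return A / sd


def soft_threshold(w, lam):
    """Shrink every entry toward zero by lam; small entries become exactly 0."""
    return np.sign(w) * np.maximum(np.abs(w) - lam, 0.0)


def sparse_canonical_pair(X, Y, lam_x, lam_y, n_iter=200, tol=1e-8):
    """Illustrative sparse weights (u, v) and the correlation of X u with Y v."""
    X, Y = standardize(X), standardize(Y)
    K = X.T @ Y / X.shape[0]  # sample cross-correlation matrix
    # Start from the leading singular vectors of K.
    U, _, Vt = np.linalg.svd(K, full_matrices=False)
    u, v = U[:, 0], Vt[0]
    for _ in range(n_iter):
        u_new = soft_threshold(K @ v, lam_x)
        if not u_new.any():
            raise ValueError("lam_x removes every X variable; use a smaller value")
        u_new /= np.linalg.norm(u_new)
        v_new = soft_threshold(K.T @ u_new, lam_y)
        if not v_new.any():
            raise ValueError("lam_y removes every Y variable; use a smaller value")
        v_new /= np.linalg.norm(v_new)
        change = np.linalg.norm(u_new - u) + np.linalg.norm(v_new - v)
        u, v = u_new, v_new
        if change < tol:
            break
    rho = np.corrcoef(X @ u, Y @ v)[0, 1]
    return u, v, rho


# Simulated example (assumed data): one latent factor z drives the first
# three X variables and the first two Y variables; the rest is noise.
rng = np.random.default_rng(0)
n, p, q = 200, 30, 20
z = rng.normal(size=n)
X = rng.normal(size=(n, p))
Y = rng.normal(size=(n, q))
X[:, :3] = z[:, None] + 0.3 * rng.normal(size=(n, 3))
Y[:, :2] = z[:, None] + 0.3 * rng.normal(size=(n, 2))

u, v, rho = sparse_canonical_pair(X, Y, lam_x=0.5, lam_y=0.5)
print("selected X variables:", np.flatnonzero(u))
print("selected Y variables:", np.flatnonzero(v))
print("correlation of the canonical variates:", rho)
```

In this sketch the penalties directly control how many weights are set exactly to zero, which is what makes the resulting variable sets easier to read. In practice the penalties would have to be tuned, for example by cross-validation. The grouping effect attributed to the elastic net above is not reproduced by this simplified thresholding.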
