Multivariate analysis of complex gene expression and clinical phenotypes with genetic marker data

This paper summarizes contributions to group 12 of the 15th Genetic Analysis Workshop. The papers in this group focused on multivariate methods and applications for the analysis of molecular data, including genotypic data, gene expression microarray measurements, and clinical phenotypes. A range of multivariate techniques was employed to extract signals from the multi‐feature data sets provided by the workshop organizers. The methods included data reduction techniques such as principal component analysis and cluster analysis; latent variable models, including structural equation and item response modeling; joint multivariate modeling techniques; and multivariate visualization tools. This summary paper categorizes and discusses the individual contributions according to several classifications of multivariate methods. Given the wide variety in the data considered, the objectives of the analyses, and the methods applied, direct comparison of results across papers is difficult. However, the group was able to draw many interesting comparisons and parallels between the various approaches. In summary, there was a consensus among authors in group 12 that the genetic research community should continue to draw on experience from other fields such as statistics, econometrics, chemometrics, computer science, and linear systems theory. Genet. Epidemiol. 31(Suppl. 1):S103–S109, 2007. © 2007 Wiley‐Liss, Inc.
