VIMCO: variational inference for multiple correlated outcomes in genome-wide association studies

MOTIVATION In genome-wide association studies (GWAS) where multiple correlated traits have been measured on participants, analyzing the traits jointly can improve statistical power over analyzing each trait separately. A joint GWAS analysis of multiple traits raises two questions. The first is whether a genetic locus is significantly associated with any of the traits being tested. The second is which specific trait(s) are associated with that locus. Since existing methods primarily focus on the first question, this paper provides a complementary method that addresses the second. RESULTS We propose a novel method, Variational Inference for Multiple Correlated Outcomes (VIMCO), that identifies the specific trait(s) associated with a genetic locus in a joint GWAS analysis of multiple traits, while accounting for correlation among the traits. We performed extensive numerical studies and applied VIMCO to two datasets. Both the numerical studies and the real data analyses show that VIMCO improves statistical power over single-trait analysis strategies when the traits are correlated and performs comparably when they are not. AVAILABILITY AND IMPLEMENTATION The VIMCO software can be downloaded from https://github.com/XingjieShi/VIMCO. SUPPLEMENTARY INFORMATION Supplementary data are available at Bioinformatics online.
