Methods for meta‐analysis of multiple traits using GWAS summary statistics

Genome‐wide association studies (GWAS) for complex diseases have focused primarily on single‐trait analyses for disease status and disease‐related quantitative traits. For example, GWAS on risk factors for coronary artery disease analyze genetic associations of plasma lipids such as total cholesterol, LDL‐cholesterol, HDL‐cholesterol, and triglycerides (TGs) separately. However, traits are often correlated, and a joint analysis may yield greater statistical power for association than multiple univariate analyses. Recently, several multivariate methods have been proposed that require individual‐level data. Here, we develop metaUSAT (where USAT is unified score‐based association test), a novel unified association test of a single genetic variant with multiple traits that uses only summary statistics from existing GWAS. Existing methods either perform well when most of the correlated traits are affected by the genetic variant in the same direction or are powerful when only a few of the correlated traits are associated; metaUSAT, in contrast, is designed to be robust to the association structure of correlated traits. metaUSAT does not require individual‐level data and can test genetic associations of categorical and/or continuous traits. One can also use metaUSAT to analyze a single trait over multiple studies, appropriately accounting for overlapping samples, if any. metaUSAT provides an approximate asymptotic P‐value for association and is computationally efficient enough for implementation at a genome‐wide level. Simulation experiments show that metaUSAT maintains proper type‐I error at low error levels. It has similar and sometimes greater power to detect association across a wide array of scenarios compared to existing methods, which are usually powerful only for some specific association scenarios.
When applied to plasma lipids summary data from the METSIM and the T2D‐GENES studies, metaUSAT detected genome‐wide significant loci beyond the ones identified by univariate analyses. Evidence from larger studies suggests that the variants additionally detected by our test are, indeed, associated with lipid levels in humans. In summary, metaUSAT can provide novel insights into the genetic architecture of common diseases or traits.
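
To make the idea of a summary-statistics-only multi-trait test concrete, the sketch below computes two generic statistics for one variant from per-trait z-scores: a Wald-type quadratic form that uses the trait correlation matrix, and a sum-of-squares statistic that ignores it. It then forms a weighted combination of the two. This illustrates the general approach only and is not the authors' implementation: the function names, the weight `omega`, the z-scores, and the correlation matrix are all hypothetical, and the approximate P-value that metaUSAT reports for its combined test is not reproduced here.

```python
import math
import numpy as np


def chi2_sf_even_df(x, df):
    """Upper-tail probability of a chi-square variable.

    Uses the closed form that holds only for even degrees of freedom.
    """
    if df % 2 != 0:
        raise ValueError("closed form requires even df")
    half = x / 2.0
    return math.exp(-half) * sum(
        half**j / math.factorial(j) for j in range(df // 2)
    )


def multitrait_stats(z, R, omega):
    """Return (Wald-type, sum-of-squares, weighted) statistics.

    z     : per-trait z-scores for a single variant
    R     : trait correlation matrix (assumed known or estimated elsewhere)
    omega : hypothetical weight in [0, 1] mixing the two statistics
    """
    z = np.asarray(z, dtype=float)
    R = np.asarray(R, dtype=float)
    t_wald = float(z @ np.linalg.solve(R, z))  # z' R^{-1} z
    t_ssu = float(z @ z)                       # z' z
    t_omega = omega * t_wald + (1.0 - omega) * t_ssu
    return t_wald, t_ssu, t_omega


# Illustrative (made-up) inputs for four traits with equal pairwise correlation.
z_scores = [2.1, 1.8, -0.4, 2.5]
R = np.full((4, 4), 0.3)
np.fill_diagonal(R, 1.0)

t_wald, t_ssu, t_omega = multitrait_stats(z_scores, R, omega=0.5)

# Under the null, the Wald-type statistic is chi-square with 4 df.
p_wald = chi2_sf_even_df(t_wald, df=4)
print(f"Wald-type: {t_wald:.3f} (p = {p_wald:.3g}), sum of squares: {t_ssu:.3f}")
```

The Wald-type statistic tends to do well when effects do not all line up with the trait correlation, while statistics that ignore the correlation can do better in other settings. Mixing the two over a range of weights is one way to hedge between such scenarios.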
