MGAS: a powerful tool for multivariate gene-based genome-wide association analysis

Motivation: Standard genome-wide association studies, which test the association between one phenotype and a large number of single nucleotide polymorphisms (SNPs), are limited in two ways: (i) traits are often multivariate, and analysis of composite scores entails a loss of statistical power, and (ii) gene-based analyses may be preferred, e.g. to reduce the multiple-testing burden.

Results: Here we present a new method, multivariate gene-based association test by extended Simes procedure (MGAS), that allows gene-based testing of multivariate phenotypes in unrelated individuals. Through extensive simulation, we show that under most trait-generating genotype–phenotype models MGAS has superior statistical power to detect associated genes compared with gene-based analyses of univariate phenotypic composite scores (i.e. GATES, multiple regression) and multivariate analysis of variance (MANOVA). Re-analysis of metabolic data revealed 32 genes that were genome-wide significant under False Discovery Rate control, and 12 regions harboring multiple genes; of these 44 regions, 30 were not reported in the original analysis.

Conclusion: MGAS allows researchers to conduct their multivariate gene-based analyses efficiently, and without the loss of power that is often associated with an incorrectly specified genotype–phenotype model.

Availability and implementation: MGAS is freely available in KGG v3.0 (http://statgenpro.psychiatry.hku.hk/limx/kgg/download.php). Access to the metabolic dataset can be requested at dbGaP (https://dbgap.ncbi.nlm.nih.gov/). The R-simulation code is available from http://ctglab.nl/people/sophie_van_der_sluis.

Contact: mxli@hku.hk

Supplementary information: Supplementary data are available at Bioinformatics online.
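For readers unfamiliar with the Simes procedure that MGAS extends, the following is a minimal sketch of the classical (unextended) Simes combination of p-values. It is not the MGAS algorithm: the abstract does not specify how the extension accounts for correlation among SNPs and phenotypes, so that adjustment is deliberately left out. The function name `simes_p` is hypothetical and chosen only for this illustration.

```python
from typing import Sequence


def simes_p(p_values: Sequence[float]) -> float:
    """Classical Simes combined p-value for a set of tests.

    Illustrative only: this is the standard Simes test, NOT MGAS.
    It sorts the p-values and returns min_j (m * p_(j) / j),
    where m is the number of tests and p_(j) is the j-th smallest.
    """
    if not p_values:
        raise ValueError("need at least one p-value")
    m = len(p_values)
    ordered = sorted(p_values)
    # Rank j runs from 1 (smallest p-value) to m (largest).
    combined = min(m * p / rank for rank, p in enumerate(ordered, start=1))
    return min(1.0, combined)


# Hypothetical p-values, e.g. for three SNPs in one gene.
gene_p = simes_p([0.01, 0.04, 0.03])
```

In a gene-based setting, the inputs would be the per-SNP (and, for a multivariate test, per-phenotype) association p-values belonging to one gene, and the output a single p-value for that gene.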
