Meta-analysis of gene-level associations for rare variants based on single-variant statistics.

Meta-analysis of genome-wide association studies (GWASs) has led to the discovery of many common variants associated with complex human diseases. There is a growing recognition that identifying "causal" rare variants also requires large-scale meta-analysis. The fact that association tests with rare variants are performed at the gene level rather than at the variant level poses unprecedented challenges for meta-analysis. First, different studies may adopt different gene-level tests, so their results are not compatible. Second, gene-level tests require multivariate statistics (i.e., the components of the test statistic and their covariance matrix), which are difficult to obtain. To overcome these challenges, we propose to perform gene-level tests for rare variants by combining the results of single-variant analysis (i.e., p values of association tests and effect estimates) from participating studies. This simple strategy is possible because of the insight that multivariate statistics can be recovered from single-variant statistics, together with the correlation matrix of the single-variant test statistics, which can be estimated from one of the participating studies or from a publicly available database. We show both theoretically and numerically that the proposed meta-analysis approach provides accurate control of the type I error and is as powerful as joint analysis of individual participant data. This approach accommodates any disease phenotype and any study design and produces all commonly used gene-level tests. An application to the GWAS summary results of the Genetic Investigation of ANthropometric Traits (GIANT) consortium reveals rare and low-frequency variants associated with human height. The relevant software is freely available.
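The recovery step described above can be illustrated with a minimal sketch. It is not the authors' software; the function names `score_stats` and `burden_test` are hypothetical, and the sketch assumes the standard large-sample approximations that the variance of each single-variant score statistic is about 1/se^2 and the score statistic itself is about beta/se^2, where se is back-calculated from the effect estimate and the two-sided p value. The correlation matrix `corr` is assumed to come from one study or a reference panel, and only a weighted burden test is shown.

```python
import math
from statistics import NormalDist

import numpy as np


def score_stats(beta, pval, corr):
    """Approximate the score vector U and its covariance V for one study.

    Assumes nonzero effect estimates and two-sided p values in (0, 1).
    """
    beta = np.asarray(beta, dtype=float)
    pval = np.asarray(pval, dtype=float)
    # |z| implied by each two-sided p value.
    z_abs = np.array([-NormalDist().inv_cdf(p / 2.0) for p in pval])
    z = np.sign(beta) * z_abs
    se = np.abs(beta) / z_abs          # back-calculated standard errors
    root_v = 1.0 / se                  # sqrt of the assumed variance 1/se^2
    U = z * root_v                     # equals beta / se^2
    # Covariance = diag(root_v) @ corr @ diag(root_v).
    V = np.outer(root_v, root_v) * np.asarray(corr, dtype=float)
    return U, V


def burden_test(studies, weights):
    """Sum (U, V) over studies, then form a 1-df weighted burden statistic."""
    w = np.asarray(weights, dtype=float)
    U = sum(s[0] for s in studies)
    V = sum(s[1] for s in studies)
    stat = float(w @ U) ** 2 / float(w @ V @ w)
    # Upper tail of a chi-square with 1 degree of freedom.
    p = math.erfc(math.sqrt(stat / 2.0))
    return stat, p
```

A call might look like `burden_test([score_stats(b1, p1, R), score_stats(b2, p2, R)], weights)`, with one `(U, V)` pair per participating study. Other gene-level tests would reuse the same summed `U` and `V` with a different quadratic form; with a single variant and unit weight, the burden test reduces to the original single-variant test.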
