General framework for meta-analysis of rare variants in sequencing association studies.

We propose a general statistical framework for meta-analysis of gene- or region-based multimarker rare variant association tests in sequencing association studies. In genome-wide association studies, single-marker meta-analysis has been widely used to increase statistical power by combining results via regression coefficients and standard errors from different studies. In analysis of rare variants in sequencing studies, region-based multimarker tests are often used to increase power. We propose meta-analysis methods for commonly used gene- or region-based rare variants tests, such as burden tests and variance component tests. Because estimation of regression coefficients of individual rare variants is often unstable or not feasible, the proposed method avoids this difficulty by calculating score statistics instead that only require fitting the null model for each study and then aggregating these score statistics across studies. Our proposed meta-analysis rare variant association tests are conducted based on study-specific summary statistics, specifically score statistics for each variant and between-variant covariance-type (linkage disequilibrium) relationship statistics for each gene or region. The proposed methods are able to incorporate different levels of heterogeneity of genetic effects across studies and are applicable to meta-analysis of multiple ancestry groups. We show that the proposed methods are essentially as powerful as joint analysis by directly pooling individual level genotype data. We conduct extensive simulations to evaluate the performance of our methods by varying levels of heterogeneity across studies, and we apply the proposed methods to meta-analysis of rare variant effects in a multicohort study of the genetics of blood lipid levels.

[1]  John P A Ioannidis,et al.  Meta-analysis in genome-wide association studies. , 2009, Pharmacogenomics.

[2]  Wolfgang Viechtbauer,et al.  Conducting Meta-Analyses in R with the metafor Package , 2010 .

[3]  Meta-Analysis of Gene Level Association Tests , 2013, 1305.1318.

[4]  K. Mossman The Wellcome Trust Case Control Consortium, U.K. , 2008 .

[5]  Eleazar Eskin,et al.  Random-effects model aimed at discovering associations in meta-analysis of genome-wide association studies. , 2011, American journal of human genetics.

[6]  S. Leal,et al.  Methods for detecting associations with rare variants for common diseases: application to analysis of sequence data. , 2008, American journal of human genetics.

[7]  Xihong Lin,et al.  Hypothesis testing in semiparametric additive mixed models. , 2003, Biostatistics.

[8]  E. Suchman,et al.  The American Soldier: Adjustment During Army Life. , 1949 .

[9]  Tanya M. Teslovich,et al.  Biological, Clinical, and Population Relevance of 95 Loci for Blood Lipids , 2010, Nature.

[10]  A. Morris,et al.  Transethnic Meta-Analysis of Genomewide Association Studies , 2011, Genetic epidemiology.

[11]  W. Ansorge Next-generation DNA sequencing techniques. , 2009, New biotechnology.

[12]  M. Rieder,et al.  Optimal unified approach for rare-variant association testing with application to small-sample case-control whole-exome sequencing studies. , 2012, American journal of human genetics.

[13]  Kathryn Roeder,et al.  Testing for an Unusual Distribution of Rare Variants , 2011, PLoS genetics.

[14]  P. Bork,et al.  A method and server for predicting damaging missense mutations , 2010, Nature Methods.

[15]  F. Collins,et al.  Potential etiologic and functional implications of genome-wide association loci for human diseases and traits , 2009, Proceedings of the National Academy of Sciences.

[16]  G. Tseng,et al.  Comprehensive literature review and statistical considerations for GWAS meta-analysis , 2012, Nucleic acids research.

[17]  Yurii S. Aulchenko,et al.  The Empirical Power of Rare Variant Association Methods: Results from Sanger Sequencing in 1,998 Individuals , 2012, PLoS genetics.

[18]  P. Visscher,et al.  Conditional and joint multiple-SNP analysis of GWAS summary statistics identifies additional variants influencing complex traits , 2012, Nature Genetics.

[19]  M. McCarthy,et al.  Meta-analysis of genome-wide association data and large-scale replication identifies additional susceptibility loci for type 2 diabetes , 2008, Nature Genetics.

[20]  M. McCarthy,et al.  Genome-wide association studies for complex traits: consensus, uncertainty and challenges , 2008, Nature Reviews Genetics.

[21]  Education Division Studies in social psychology in World War II , 1949 .

[22]  Xihong Lin,et al.  Optimal tests for rare variant effects in sequencing association studies. , 2012, Biostatistics.

[23]  D Y Lin,et al.  Meta‐analysis of genome‐wide association studies: no efficiency gain in using individual participant data , 2009, Genetic epidemiology.

[24]  Jing Cui,et al.  Genome-wide association study meta-analysis identifies seven new rheumatoid arthritis risk loci , 2010, Nature Genetics.

[25]  Kathryn Roeder,et al.  Analysis of Rare, Exonic Variation amongst Subjects with Autism Spectrum Disorders and Population Controls , 2013, PLoS genetics.

[26]  Evangelos Evangelou,et al.  Heterogeneity in Meta-Analyses of Genome-Wide Association Investigations , 2007, PloS one.

[27]  Tatiana A. Tatusova,et al.  NCBI Reference Sequences (RefSeq): current status, new features and genome annotation policy , 2011, Nucleic Acids Res..

[28]  Josyf Mychaleckyj,et al.  Robust relationship inference in genome-wide association studies , 2010, Bioinform..

[29]  S. Browning,et al.  A Groupwise Association Test for Rare Mutations Using a Weighted Sum Statistic , 2009, PLoS genetics.

[30]  D. Altshuler,et al.  A map of human genome variation from population-scale sequencing , 2010, Nature.

[31]  R. Fisher Statistical methods for research workers , 1927, Protoplasma.

[32]  S. Gabriel,et al.  Calibrating a coalescent simulation of human genome sequence variation. , 2005, Genome research.

[33]  Zhaohui S. Qin,et al.  A second generation human haplotype map of over 3.1 million SNPs , 2007, Nature.

[34]  R. Davies The distribution of a linear combination of 2 random variables , 1980 .

[35]  Lei Sun,et al.  Robust and Powerful Tests for Rare Variants Using Fisher's Method to Combine Evidence of Association From Two or More Complementary Tests , 2013, Genetic epidemiology.

[36]  S. Henikoff,et al.  Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm , 2009, Nature Protocols.

[37]  Pierre Lafaye de Micheaux,et al.  Computing the distribution of quadratic forms: Further comparisons between the Liu-Tang-Zhang approximation and exact methods , 2010, Comput. Stat. Data Anal..

[38]  Xihong Lin,et al.  Rare-variant association testing for sequencing data with the sequence kernel association test. , 2011, American journal of human genetics.

[39]  Lee-Jen Wei,et al.  Pooled Association Tests for Rare Variants in Exon-Resequencing Studies , 2010 .