Quick, “Imputation-free” meta-analysis with proxy-SNPs

Background: Meta-analysis (MA) is widely used to pool genome-wide association studies (GWASes), either a) to increase the power to detect strong or weak genotype effects or b) to verify results. Because SNP panels differ among genotyping chips, imputation is the method of choice within GWAS consortia to avoid losing too many SNPs in an MA. YAMAS (Yet Another Meta Analysis Software), however, enables cross-GWAS conclusions before time-consuming imputation runs have been finished and polished.

Results: Here we present a fast method to avoid forfeiting SNPs present in only a subset of studies, without relying on imputation. This is accomplished by using reference linkage disequilibrium data from the 1000 Genomes/HapMap projects to find proxy-SNPs, together with in-phase alleles, for SNPs missing in at least one study. MA is conducted by combining the association effect estimates of a SNP with those of its proxy-SNPs. Our algorithm is implemented in the MA software YAMAS. Association results from GWAS analysis applications can be used as input files for MA, which speeds up MA tremendously compared to the conventional imputation approach. We show that our proxy algorithm is well-powered and yields valuable ad hoc results, possibly providing an incentive for follow-up studies. We propose our method as a quick screening step prior to imputation-based MA, as well as an additional main approach for studies without available reference data matching the ethnicities of study participants.
As a proof of principle, we analyzed six dbGaP Type II Diabetes GWAS and found that the proxy algorithm clearly outperforms naïve MA on the p-value level: for 17 out of 23 we observe an improvement in the p-value by a factor of more than two, with a maximum improvement by a factor of 2127.

Conclusions: YAMAS is an efficient and fast meta-analysis program that offers various methods, including conventional MA as well as the insertion of proxy-SNPs for missing markers to avoid unnecessary power loss. MA with YAMAS can be conducted readily, because YAMAS provides a generic parser for the heterogeneous tabulated file formats used in the GWAS field and avoids cumbersome setups. In this way, it supplements the meta-analysis process.
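
The idea of combining a SNP's effect estimates with those of its proxy-SNPs can be illustrated with a minimal Python sketch. It is not the YAMAS implementation: it assumes a standard inverse-variance weighted fixed-effects model, and the function names, the dictionary layout, and the `proxies` lookup table (which in practice would be derived from reference linkage disequilibrium data) are all hypothetical. It also ignores any weighting by LD strength between a SNP and its proxy.

```python
import math


def combine_fixed_effects(estimates):
    """Pool (beta, se) pairs with inverse-variance weights (fixed effects)."""
    weights = [1.0 / (se * se) for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal p-value
    return beta, se, p


def pick_estimate(study, snp, proxies):
    """Return (beta, se) for snp in one study, falling back to a proxy-SNP.

    proxies maps a SNP id to a list of (proxy_id, in_phase) pairs. This is a
    hypothetical layout; in_phase says whether the proxy's effect allele is
    in phase with the target SNP's effect allele.
    """
    if snp in study:
        return study[snp]
    for proxy_id, in_phase in proxies.get(snp, []):
        if proxy_id in study:
            beta, se = study[proxy_id]
            # Flip the sign when the proxy's effect allele is out of phase.
            return (beta if in_phase else -beta, se)
    return None  # neither the SNP nor any proxy was genotyped in this study


def meta_analyse(studies, snp, proxies):
    """Pool one SNP across studies, substituting proxies where it is missing."""
    picked = (pick_estimate(study, snp, proxies) for study in studies)
    estimates = [e for e in picked if e is not None]
    return combine_fixed_effects(estimates) if estimates else None


# Toy data (invented for illustration): rs1 is missing from the second study,
# where its proxy rs2 was genotyped with the opposite allele orientation.
studies = [{"rs1": (0.2, 0.1)}, {"rs2": (-0.4, 0.1)}]
proxies = {"rs1": [("rs2", False)]}
pooled_beta, pooled_se, pooled_p = meta_analyse(studies, "rs1", proxies)
```

A naïve MA would use only the first study for rs1 in this toy example, whereas the proxy substitution lets both studies contribute to the pooled estimate.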
