3 Population-specific genotype imputations using minimac or IMPUTE2

Meaningful analysis of common and rare genetic variants requires combining the results of genome-wide association studies (GWASs) from multiple cohorts in a meta-analysis to obtain sufficient statistical power. This requires all cohorts to have the same single-nucleotide polymorphisms (SNPs) in their GWASs. To this end, genotypes that have not been measured in a given cohort can be imputed on the basis of a set of reference haplotypes. This protocol provides guidelines for performing imputations with two widely used tools: minimac and IMPUTE2 [16]. These guidelines were developed and used by the Genome of the Netherlands (GoNL) consortium, which has created a population-specific reference panel for genetic imputations and used this reference panel to impute various Dutch biobanks. We also describe several factors that might influence the final imputation quality. This protocol, which has been used by the largest Dutch biobanks, takes several days to complete, depending on the sample size of the biobank and the computer resources available.

[1] Markus Scholz, et al. fcGENE: A Versatile Tool for Processing and Transforming SNP Datasets, 2014, PLoS ONE.

[2] P. Donnelly, et al. A Flexible and Accurate Genotype Imputation Method for the Next Generation of Genome-Wide Association Studies, 2009, PLoS Genetics.

[3] G. Abecasis, et al. Merlin—rapid analysis of dense genetic maps using sparse gene flow trees, 2002, Nature Genetics.

[4] J. Marchini, et al. Genotype imputation for genome-wide association studies, 2010, Nature Reviews Genetics.

[5] B. Browning, et al. A unified approach to genotype imputation and haplotype-phase inference for large data sets of trios and unrelated individuals, 2009, American Journal of Human Genetics.

[6] A. Morris, et al. Data quality control in genetic case-control association studies, 2010, Nature Protocols.

[7] Cathy C. Laurie, et al. Is 'forward' the same as 'plus'?…and other adventures in SNP allele nomenclature, 2012, Trends in Genetics.

[8] Pieter B. T. Neerincx, et al. Supplementary Information: Whole-genome sequence variation, population structure and demographic history of the Dutch population, 2022.

[9] Tom H. Pringle, et al. The human genome browser at UCSC, 2002, Genome Research.

[10] Zoltán Kutalik, et al. Quality control and conduct of genome-wide association meta-analyses, 2014, Nature Protocols.

[11] Nilanjan Chatterjee, et al. Improved imputation of common and uncommon SNPs with a new reference set, 2011, Nature Genetics.

[12] G. Abecasis, et al. MaCH: using sequence and genotype data to estimate haplotypes and unobserved genotypes, 2010, Genetic Epidemiology.

[13] O. Delaneau, et al. Supplementary Information for ‘Improved whole chromosome phasing for disease and population genetic studies’, 2012.

[14] B. Browning, et al. Haplotype phasing: existing methods and new developments, 2011, Nature Reviews Genetics.

[15] Manuel A. R. Ferreira, et al. Practical aspects of imputation-driven meta-analysis of genome-wide association studies, 2008, Human Molecular Genetics.

[16] Pieter B. T. Neerincx, et al. The Genome of the Netherlands: design, and project goals, 2013, European Journal of Human Genetics.

[17] Pieter B. T. Neerincx, et al. Genome of the Netherlands population-specific imputations identify an ABCA6 variant associated with cholesterol levels, 2015, Nature Communications.

[18] Luke Jostins, et al. Imputation of low-frequency variants using the HapMap3 benefits from large, diverse reference sets, 2011, European Journal of Human Genetics.

[19] Carlo Sidore, et al. Rare variant genotype imputation with thousands of study-specific whole-genome sequences: implications for cost-effective study designs, 2014, European Journal of Human Genetics.

[20] GACT: a Genome build and Allele definition Conversion Tool for SNP imputation and meta-analysis in genetic association studies, 2014, BMC Genomics.

[21] Gonçalo R. Abecasis, et al. The variant call format and VCFtools, 2011, Bioinform.

[22] Li Shen, et al. The effect of reference panels and software tools on genotype imputation, 2011, AMIA ... Annual Symposium proceedings. AMIA Symposium.

[23] Sharon R. Grossman, et al. Integrating common and rare genetic variation in diverse human populations, 2010, Nature.

[24] J. Marchini, et al. Fast and accurate genotype imputation in genome-wide association studies through pre-phasing, 2012, Nature Genetics.

[25] Heorhiy Byelas, et al. Improved imputation quality of low-frequency and rare variants in European samples using the ‘Genome of The Netherlands’, 2014, European Journal of Human Genetics.

[26] Marylyn D. Ritchie, et al. Imputation and quality control steps for combining multiple genome-wide datasets, 2014, Front. Genet.