On Quality Control Measures in Genome-Wide Association Studies: A Test to Assess the Genotyping Quality of Individual Probands in Family-Based Association Studies and an Application to the HapMap Data

Allele transmissions in pedigrees provide a natural way of evaluating the genotyping quality of a particular proband in a family-based, genome-wide association study. We propose a transmission test that is based on this feature and that can be used for quality control filtering of genome-wide genotype data for individual probands. The test has one degree of freedom and assesses the average genotyping error rate of the genotyped SNPs for a particular proband. As we show in simulation studies, the test is sufficiently powerful to identify probands with an unreliable genotyping quality that cannot be detected with standard quality control filters. This feature of the test is further exemplified by an application to the third release of the HapMap data. The test is ideally suited as the final layer of quality control filters in the cleaning process of genome-wide association studies. It identifies probands with insufficient genotyping quality that were not removed by standard quality control filtering.

[1]  K. Cheng,et al.  A Simple and Robust TDT-Type Test against Genotyping Error with Error Rates Varying across Families , 2007, Human Heredity.

[2]  T. Hudson,et al.  A genome-wide association study identifies novel risk loci for type 2 diabetes , 2007, Nature.

[3]  Zhaohui S. Qin,et al.  A second generation human haplotype map of over 3.1 million SNPs , 2007, Nature.

[4]  Daniel Rabinowitz,et al.  A Unified Approach to Adjusting Association Tests for Population Admixture with Arbitrary Pedigree Structure and Arbitrary Missing Marker Information , 2000, Human Heredity.

[5]  K. Mossman The Wellcome Trust Case Control Consortium, U.K. , 2008 .

[6]  Marcia M. Nizzari,et al.  Genome-Wide Association Analysis Identifies Loci for Type 2 Diabetes and Triglyceride Levels , 2007, Science.

[7]  Judy H. Cho,et al.  A Genome-Wide Association Study Identifies IL23R as an Inflammatory Bowel Disease Gene , 2006, Science.

[8]  E. Martin,et al.  A test for linkage and association in general pedigrees: the pedigree disequilibrium test. , 2000, American journal of human genetics.

[9]  J. Ott,et al.  Power and Sample Size Calculations for Case-Control Genetic Association Tests when Errors Are Present: Application to Single Nucleotide Polymorphisms , 2002, Human Heredity.

[10]  Michael Boehnke,et al.  Probability of detection of genotyping errors and mutations as inheritance inconsistencies in nuclear-family data. , 2002, American journal of human genetics.

[11]  Aravinda Chakravarti,et al.  Undetected genotyping errors cause apparent overtransmission of common alleles in the transmission/disequilibrium test. , 2003, American journal of human genetics.

[12]  F. Hu,et al.  A Common Genetic Variant Is Associated with Adult and Childhood Obesity , 2006, Science.

[13]  Lester L. Peters,et al.  Genome-wide association study identifies novel breast cancer susceptibility loci , 2007, Nature.

[14]  D. Gudbjartsson,et al.  Genome-wide association study identifies a second prostate cancer susceptibility variant at 8q24 , 2007, Nature Genetics.

[15]  Xin Xu,et al.  Implementing a unified approach to family‐based tests of association , 2000, Genetic epidemiology.

[16]  M. Daly,et al.  Genome-wide association studies for common diseases and complex traits , 2005, Nature Reviews Genetics.

[17]  J. Ott,et al.  A transmission/disequilibrium test that allows for genotyping errors in the analysis of single-nucleotide polymorphism data. , 2001, American journal of human genetics.

[18]  M. Daly,et al.  Evaluating and improving power in whole-genome association studies using fixed marker sets , 2006, Nature Genetics.

[19]  Judy H Cho,et al.  Genome-wide association study identifies new susceptibility loci for Crohn disease and implicates autophagy in disease pathogenesis , 2007, Nature Genetics.

[20]  Jeanette C Papp,et al.  Detection and integration of genotyping errors in statistical genetics. , 2002, American journal of human genetics.

[21]  J. Ott,et al.  Complement Factor H Polymorphism in Age-Related Macular Degeneration , 2005, Science.

[22]  Y. Teo,et al.  Common statistical issues in genome-wide association studies: a review on power, data quality control, genotype calling and population structure , 2008, Current opinion in lipidology.

[23]  W. Willett,et al.  A genome-wide association study identifies alleles in FGFR2 associated with risk of sporadic postmenopausal breast cancer , 2007, Nature Genetics.

[24]  W. Ewens,et al.  Transmission test for linkage disequilibrium: the insulin gene region and insulin-dependent diabetes mellitus (IDDM). , 1993, American journal of human genetics.

[25]  R. Mägi,et al.  Evaluating the performance of commercial whole-genome marker sets for capturing common genetic variation , 2007, BMC Genomics.

[26]  M. McCarthy,et al.  Replication of Genome-Wide Association Signals in UK Samples Reveals Risk Loci for Type 2 Diabetes , 2007, Science.

[27]  P. Fearnhead,et al.  Genome-wide association study of prostate cancer identifies a second risk locus at 8q24 , 2007, Nature Genetics.

[28]  Lon R Cardon,et al.  Evaluating coverage of genome-wide association studies , 2006, Nature Genetics.

[29]  G. Abecasis,et al.  A Genome-Wide Association Study of Type 2 Diabetes in Finns Detects Multiple Susceptibility Variants , 2007, Science.

[30]  Simon C. Potter,et al.  Genome-wide association study of 14,000 cases of seven common diseases and 3,000 shared controls , 2007, Nature.

[31]  P. Donnelly,et al.  New models of collaboration in genome-wide association studies: the Genetic Association Information Network , 2007, Nature Genetics.

[32]  Stephen J Finch,et al.  What SNP genotyping errors are most costly for genetic association studies? , 2004, Genetic epidemiology.