What Should the Genome-wide Significance Threshold Be? Empirical Replication of Borderline Genetic Associations

For a full list of investigators offering data and clarifications, see Acknowledgments.

BACKGROUND: Robust replication is a sine qua non for the rigorous documentation of proposed associations in the genome-wide association (GWA) setting. Currently, associations of common variants reaching P ≤ 5 × 10⁻⁸ are considered replicated. However, there is some ambiguity about the most suitable threshold for claiming genome-wide significance.

METHODS: We defined as 'borderline' those associations with P > 5 × 10⁻⁸ and P ≤ 1 × 10⁻⁷. The eligible associations were retrieved using the 'Catalog of Published Genome-Wide Association Studies'. For each association we assessed whether it reached P ≤ 5 × 10⁻⁸ with inclusion of additional data from subsequent GWA studies.

RESULTS: Thirty-four eligible genotype-phenotype associations were evaluated with data and clarifications contributed by diverse investigators. Replication data from subsequent GWA studies could be obtained for 26 of them. Of those, 19 associations (73%) reached P ≤ 5 × 10⁻⁸ for the same or a related trait, implicating either the exact same allele or one in very high linkage disequilibrium, and 17 reached P < 10⁻⁸. If the seven associations that did not reach P ≤ 5 × 10⁻⁸ when additional data were considered are assumed to have been false positives, the false-discovery rate for borderline associations is estimated to be 27% [95% confidence interval (CI) 12-48%]. For five associations, the current P-value is > 10⁻⁶ [corresponding false-discovery rate 19% (95% CI 7-39%)].

CONCLUSION: A substantial proportion, but not all, of the associations with borderline genome-wide significance represent replicable, possibly genuine associations. Our empirical evaluation suggests a possible relaxation in the current genome-wide significance (GWS) threshold.
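The false-discovery rates above are simple proportions of the 26 associations with replication data (7/26 and 5/26). The abstract does not say how the confidence intervals were computed; the sketch below assumes an exact binomial (Clopper-Pearson) interval, which happens to reproduce the reported bounds after rounding. The function names (`binom_cdf`, `solve`, `clopper_pearson`, `summarize`) are illustrative and not taken from the paper.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def solve(g, target, increasing, iters=100):
    """Bisection on [0, 1] for g(p) = target, where g is monotone in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if (g(mid) < target) == increasing:
            lo = mid  # root lies to the right of mid
        else:
            hi = mid  # root lies to the left of mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided CI for a binomial proportion (assumed method, see lead-in)."""
    # Lower bound: the p at which P(X >= k) equals alpha/2 (increasing in p).
    lower = 0.0 if k == 0 else solve(
        lambda p: 1 - binom_cdf(k - 1, n, p), alpha / 2, True)
    # Upper bound: the p at which P(X <= k) equals alpha/2 (decreasing in p).
    upper = 1.0 if k == n else solve(
        lambda p: binom_cdf(k, n, p), alpha / 2, False)
    return lower, upper


def summarize(k, n):
    """Point estimate and CI bounds as whole percentages."""
    lo, hi = clopper_pearson(k, n)
    return round(100 * k / n), round(100 * lo), round(100 * hi)


print(summarize(7, 26))  # did not reach P <= 5e-8: (27, 12, 48)
print(summarize(5, 26))  # current P > 1e-6: (19, 7, 39)
```

With only 26 associations the intervals are wide, which is why the estimated false-discovery rate for borderline associations could plausibly lie anywhere from roughly one in eight to nearly one in two.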
