Genome-wide association studies for complex traits: consensus, uncertainty and challenges

The past year has witnessed substantial advances in understanding the genetic basis of many common phenotypes of biomedical importance. These advances have been the result of systematic, well-powered, genome-wide surveys exploring the relationships between common sequence variation and disease predisposition. This approach has revealed over 50 disease-susceptibility loci and has provided insights into the allelic architecture of multifactorial traits. At the same time, much has been learned about the successful prosecution of association studies on such a scale. This Review highlights the knowledge gained, defines areas of emerging consensus, and describes the challenges that remain as researchers seek to obtain more complete descriptions of the susceptibility architecture of biomedical traits of interest and to translate the information gathered into improvements in clinical management.
