Additional SNPs and linkage-disequilibrium analyses are necessary for whole-genome association studies in humans

More than 5 million single-nucleotide polymorphisms (SNPs) with minor-allele frequency greater than 10% are expected to exist in the human genome. Some of these SNPs may be associated with risk of developing common diseases. To assess the power of currently available SNPs to detect such associations, we resequenced 50 genes in two ethnic samples and measured patterns of linkage disequilibrium between the subset of SNPs reported in dbSNP and the complete set of common SNPs. Our results suggest that using all 2.7 million SNPs currently in the database would detect nearly 80% of all common SNPs in European populations but only 50% of those common in the African American population. They also suggest that efficient selection of a minimal subset of SNPs for use in association studies requires measurement of allele frequency and linkage disequilibrium relationships for all SNPs in dbSNP.
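
To make the idea of one SNP "detecting" another concrete, the following is a minimal sketch and not the authors' pipeline. It assumes phased haplotypes coded as 0/1 alleles, uses the standard pairwise r^2 statistic as the measure of linkage disequilibrium, and treats a common SNP as detected when it is in the database itself or has r^2 at or above a cutoff with some database SNP. The function names `r_squared` and `coverage`, the `in_database` flags, and the 0.8 cutoff are all illustrative assumptions; the abstract does not state which statistic or threshold was used.

```python
from typing import List, Sequence


def r_squared(a: Sequence[int], b: Sequence[int]) -> float:
    """Pairwise r^2 between two SNPs given 0/1 alleles on the same chromosomes."""
    n = len(a)
    p_a = sum(a) / n                      # frequency of allele 1 at SNP a
    p_b = sum(b) / n                      # frequency of allele 1 at SNP b
    p_ab = sum(1 for x, y in zip(a, b) if x == 1 and y == 1) / n
    d = p_ab - p_a * p_b                  # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:                        # a monomorphic site carries no LD signal
        return 0.0
    return d * d / denom


def coverage(snps: List[Sequence[int]],
             in_database: List[bool],
             threshold: float = 0.8) -> float:
    """Fraction of SNPs that are in the database or tagged by a database SNP.

    `threshold` is a hypothetical cutoff chosen for illustration only.
    """
    db_snps = [s for s, flag in zip(snps, in_database) if flag]
    detected = 0
    for snp, flag in zip(snps, in_database):
        if flag or any(r_squared(snp, t) >= threshold for t in db_snps):
            detected += 1
    return detected / len(snps)


if __name__ == "__main__":
    # Toy data: three SNPs typed on four chromosomes; only the first is "in dbSNP".
    snps = [[0, 0, 1, 1], [0, 0, 1, 1], [0, 1, 0, 1]]
    print(coverage(snps, [True, False, False]))
```

In the toy data the second SNP is perfectly correlated with the database SNP and counts as detected, while the third is uncorrelated with it and is missed. Under this sketch, lower linkage disequilibrium in a population would reduce coverage for the same database.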
