A map of human genome sequence variation containing 1.42 million single nucleotide polymorphisms

We describe a map of 1.42 million single nucleotide polymorphisms (SNPs) distributed throughout the human genome, providing an average density on available sequence of one SNP every 1.9 kilobases. These SNPs were primarily discovered by two projects: The SNP Consortium and the analysis of clone overlaps by the International Human Genome Sequencing Consortium. The map integrates all publicly available SNPs with described genes and other genomic features. We estimate that 60,000 SNPs fall within exons (coding and untranslated regions), and 85% of exons are within 5 kb of the nearest SNP. Nucleotide diversity varies greatly across the genome, in a manner broadly consistent with a standard population genetic model of human history. This high-density SNP map provides a public resource for defining haplotype variation across the genome, and should help to identify biomedically important genes for diagnosis and therapy.
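
As a rough consistency check on the density figure, the SNP count and the average spacing together imply the amount of sequence the map was built on. The sketch below only multiplies the two numbers quoted in the abstract; the function name is illustrative, and the resulting total of about 2.7 Gb is a back-of-envelope inference, not a figure reported in the source.

```python
# Back-of-envelope check of the density figure quoted in the abstract.
# Both constants come from the abstract; the derived total is an inference.
SNP_COUNT = 1_420_000   # SNPs in the map
KB_PER_SNP = 1.9        # average spacing: one SNP every 1.9 kb


def implied_sequence_gb(snp_count: int, kb_per_snp: float) -> float:
    """Return the sequence length (in gigabases) implied by a SNP count and spacing."""
    base_pairs = snp_count * kb_per_snp * 1_000  # kb -> bp
    return base_pairs / 1e9                      # bp -> Gb


implied_gb = implied_sequence_gb(SNP_COUNT, KB_PER_SNP)
print(f"Implied available sequence: {implied_gb:.2f} Gb")
```

The result is close to the size of the draft genome sequence available at the time, which is consistent with the density being stated "on available sequence" rather than on the full genome.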
