Sequence variations in the public human genome data reflect a bottlenecked population history

Single-nucleotide polymorphisms (SNPs) account for the great majority of sequence variation in the human genome, and, as heritable variable landmarks, they are useful markers for disease mapping and for resolving population structure. Overlaps between large-insert genomic clones sequenced as part of the Human Genome Project provide redundant coverage of a quarter of the genome, and this fraction is representative of the whole in base composition and functional sequence features. We mined these regions to produce 500,000 high-confidence SNP candidates as a uniform resource for describing nucleotide diversity and its regional variation within the genome. Interpreted under a model of recombination and population size change, the distributions of marker density observed at different overlap length scales show that the history of the population represented by the public genome sequence is one of collapse followed by a recent phase of mild size recovery. The inferred times of collapse and recovery fall in the Upper Paleolithic, in agreement with archaeological evidence of the initial modern human colonization of Europe.
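
To make the two summary quantities above concrete, the following is a minimal sketch of how per-site nucleotide diversity and the distribution of marker counts by overlap length class might be tabulated. It is not the authors' pipeline: the function names, the input layout (pairs of overlap length and SNP count), the length bins, and the use of Watterson's estimator with two chromosomes per overlap are all assumptions made for illustration.

```python
from collections import Counter, defaultdict


def watterson_theta(num_snps, length_bp, n_chromosomes=2):
    """Per-site Watterson estimate: S / (a_n * L), with a_n = sum_{i=1}^{n-1} 1/i.

    Assumption: each clone overlap compares two chromosomes, so a_n = 1 and
    the estimate reduces to SNPs per base pair.
    """
    if length_bp <= 0:
        raise ValueError("length_bp must be positive")
    if n_chromosomes < 2:
        raise ValueError("need at least two chromosomes")
    a_n = sum(1.0 / i for i in range(1, n_chromosomes))
    return num_snps / (a_n * length_bp)


def snp_count_distribution(overlaps, length_bins):
    """Fraction of overlaps carrying k SNPs, tabulated per length bin.

    overlaps:    iterable of (length_bp, num_snps) pairs (hypothetical layout)
    length_bins: list of (low, high) half-open intervals [low, high) in bp
    Returns {bin: {k: fraction}}; overlaps outside every bin are ignored.
    """
    counts = defaultdict(Counter)
    for length_bp, num_snps in overlaps:
        for low, high in length_bins:
            if low <= length_bp < high:
                counts[(low, high)][num_snps] += 1
                break
    distribution = {}
    for bin_key, counter in counts.items():
        total = sum(counter.values())
        distribution[bin_key] = {k: c / total for k, c in counter.items()}
    return distribution


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    toy_overlaps = [(5000, 3), (5200, 0), (5100, 3), (50000, 40)]
    toy_bins = [(0, 10000), (10000, 100000)]
    print(watterson_theta(5, 10000))
    print(snp_count_distribution(toy_overlaps, toy_bins))
```

Comparing such observed count distributions, one per length class, against the distributions expected under candidate demographic histories is the kind of analysis the abstract describes. The expected distributions would have to come from a separate population-genetic model or simulator, which this sketch does not attempt.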
