Abundant raw material for cis-regulatory evolution in humans.

Changes in gene expression and regulation, due in particular to the evolution of cis-regulatory DNA sequences, may underlie many evolutionary changes in phenotypes, yet little is known about the distribution of such variation in populations. We present in this study the first survey of experimentally validated functional cis-regulatory polymorphism. These data are derived from more than 140 polymorphisms involved in the regulation of 107 genes in Homo sapiens, the eukaryote species with the most available data. We find that functional cis-regulatory variation is widespread in the human genome and that the consequent variation in gene expression is twofold or greater for 63% of the genes surveyed. Transcription factor-DNA interactions are highly polymorphic, and regulatory interactions have been gained and lost within human populations. On average, humans are heterozygous at more functional cis-regulatory sites (>16,000) than at amino acid positions (<13,000), in part because of an overrepresentation among the former in multiallelic tandem repeat variation, especially (AC)n dinucleotide microsatellites. The role of microsatellites in gene expression variation may provide a larger store of heritable phenotypic variation, and a more rapid mutational input of such variation, than has been realized. Finally, we outline the distinctive consequences of cis-regulatory variation for the genotype-phenotype relationship, including ubiquitous epistasis and genotype-by-environment interactions, as well as underappreciated modes of pleiotropy and overdominance. Ordinary small-scale mutations contribute to pervasive variation in transcription rates and consequently to patterns of human phenotypic variation.
