Tracing the Evolution of Lineage-Specific Transcription Factor Binding Sites in a Birth-Death Framework

Changes in cis-regulatory element composition that produce novel patterns of gene expression are thought to be a major contributor to the evolution of lineage-specific traits. Although transcription factor binding events vary substantially across species, most computational approaches to studying regulatory elements focus on highly conserved sites and rely heavily on multiple sequence alignments. Such conservation-based approaches, however, have limited ability to detect lineage-specific elements that could contribute to species-specific traits. In this paper, we describe a novel framework that uses a birth-death model to trace the evolution of lineage-specific binding sites without relying on detailed base-by-base cross-species alignments. We applied the model to ChIP-seq data for six transcription factors (GATA1, SOX2, CTCF, MYC, MAX, ETS1) to analyze the evolution of their binding sites along the lineage leading to human since the human-mouse common ancestor. We estimate that a substantial fraction of human binding sites (∼58–79% for each factor) originated after the divergence from mouse, and that over 15% of all binding sites are unique to hominids. Such elements are often enriched near genes associated with specific pathways and harbor more common SNPs than older binding sites in the human genome. These results support the ability of our method to identify lineage-specific regulatory elements and help clarify their roles in shaping variation in gene regulation across species.
