Transcriptional and structural control of cell identity genes

Mammals contain a wide array of cell types with distinct functions, yet nearly all cell types share the same genomic DNA. How cells selectively interpret the genetic instructions in DNA to specify their various functions is a fundamental question in biology. This thesis describes two genome-wide studies of how transcriptional control of gene expression programs defines cell identity. Recent studies suggest that a small number of transcription factors, called "master" transcription factors, dominate the control of gene expression programs. These master transcription factors and the transcriptional regulatory circuitry they produce, however, are not known for all cell types. Ectopic expression of these factors can, in principle, direct transdifferentiation of readily available cells into medically relevant cell types for applications in regenerative medicine, so limited knowledge of these factors is a roadblock to generating many such cell types. Chapter 2 presents a study in which a novel computational approach was used to generate an atlas of candidate master transcription factors for 100+ human tissue/cell types. The candidate master transcription factors in retinal pigment epithelial (RPE) cells were then used to guide the investigation of the regulatory circuitry of RPE cells and to reprogram human fibroblasts into functional RPE-like cells.
Master transcription factors define cell-type-specific gene expression by binding to enhancer elements in the genome. These enhancer-bound transcription factors regulate genes by contacting target gene promoters through the formation of DNA loops. It is becoming increasingly clear that transcription factors regulate gene expression within a larger three-dimensional (3D) chromatin architecture, but these structures and their functions are poorly understood. Chapter 3 presents a study in which Cohesin ChIA-PET data were generated to identify the local chromosomal structures at both active and repressed genes across the genome in embryonic stem cells. The results led to the discovery of functional insulated neighborhood structures that are formed by two CTCF interaction sites occupied by Cohesin. The integrity of these looped structures contributes to the transcriptional control of super-enhancer-driven active genes and of repressed genes encoding lineage-specifying developmental regulators.
