Statistical learning quantifies transposable element-mediated cis-regulation

Background: Transposable elements (TEs) have colonized the genomes of most metazoans, and many TE-embedded sequences function as cis-regulatory elements (CREs) for genes involved in a wide range of biological processes, from early embryogenesis to innate immune responses. Because of their repetitive nature, TEs have the potential to form CRE platforms enabling the coordinated and genome-wide regulation of protein-coding genes by only a handful of trans-acting transcription factors (TFs).

Results: Here, we directly test this hypothesis through mathematical modeling and demonstrate that differences in expression at protein-coding genes alone are sufficient to estimate the magnitude and significance of TE-contributed cis-regulatory activities, even in contexts where TE-derived transcription fails to do so. We leverage hundreds of overexpression experiments and estimate that, overall, gene expression is influenced by TE-embedded CREs situated within approximately 200 kb of promoters. Focusing on the cis-regulatory potential of TEs within the gene regulatory network of human embryonic stem cells, we find that pluripotency-specific and evolutionarily young TE subfamilies can be reactivated by TFs involved in post-implantation embryogenesis. Finally, we show that TE subfamilies can be split into truly regulatorily active versus inactive fractions based on additional information such as matched epigenomic data, observing that TF binding may better predict TE cis-regulatory activity than differences in histone marks.

Conclusion: Our results suggest that TE-embedded CREs contribute to gene regulation during and beyond gastrulation. On a methodological level, we provide a statistical tool that infers TE-dependent cis-regulation from RNA-seq data alone, thus facilitating the study of TEs in the next-generation sequencing era.
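The abstract does not spell out the model, so the following is only a minimal sketch of the general idea, under the assumption that the expression change of each gene is modeled as a linear combination of the number of TE copies of each subfamily found near its promoter. The function name `fit_te_activities`, the `ridge` parameter, and the toy data are all hypothetical and are not taken from the authors' tool; significance testing is omitted.

```python
# Minimal sketch (assumption): delta_expr[g] ~ sum_s counts[g][s] * activity[s].
# All names and data here are illustrative, not the authors' implementation.

def fit_te_activities(counts, delta_expr, ridge=0.0):
    """Estimate one activity per TE subfamily by (optionally ridge-penalized)
    least squares, solving the normal equations (N^T N + ridge*I) a = N^T y.

    counts[g][s]  : TE copies of subfamily s within a window around gene g's promoter
    delta_expr[g] : expression change of gene g (e.g. a log fold change)
    """
    n_genes, n_sub = len(counts), len(counts[0])
    # Build the normal-equation matrix and right-hand side.
    a = [[sum(counts[g][i] * counts[g][j] for g in range(n_genes))
          + (ridge if i == j else 0.0)
          for j in range(n_sub)] for i in range(n_sub)]
    b = [sum(counts[g][i] * delta_expr[g] for g in range(n_genes))
         for i in range(n_sub)]
    # Gaussian elimination with partial pivoting.
    for col in range(n_sub):
        pivot = max(range(col, n_sub), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            raise ValueError("singular system: subfamily counts are collinear")
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n_sub):
            factor = a[r][col] / a[col][col]
            for c in range(col, n_sub):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    activities = [0.0] * n_sub
    for i in reversed(range(n_sub)):
        tail = sum(a[i][j] * activities[j] for j in range(i + 1, n_sub))
        activities[i] = (b[i] - tail) / a[i][i]
    return activities


# Toy data: 5 genes, 2 hypothetical TE subfamilies, noise-free expression changes.
true_activities = [0.5, -0.2]
counts = [[2, 0], [0, 1], [1, 1], [3, 2], [0, 0]]
delta_expr = [sum(n * act for n, act in zip(row, true_activities)) for row in counts]

est = fit_te_activities(counts, delta_expr)
print(est)
```

On noise-free toy data the fit recovers the activities used to generate it. With real RNA-seq data one would expect to add normalization, regularization, and a test of significance for each subfamily, none of which are shown here.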
