Systematic identification and characterization of regulatory elements derived from human endogenous retroviruses

Human endogenous retroviruses (HERVs) and other long terminal repeat (LTR)-type retrotransposons (HERV/LTRs) carry regulatory elements that may influence the transcription of host genes. We systematically identified and characterized these regulatory elements using publicly available ChIP-Seq datasets for 97 transcription factors (TFs) provided by the ENCODE and Roadmap Epigenomics projects. We determined transcription factor-binding sites (TFBSs) from the ChIP-Seq datasets and identified the TFBSs located on HERV/LTR sequences (HERV-TFBSs). Overall, 794,972 HERV-TFBSs were identified. Subsequently, we identified HERV/LTR-shared regulatory elements (HSREs), each defined as a TF-binding motif in HERV-TFBSs that is shared within a substantial fraction of a HERV/LTR type. An HSRE may indicate that the regulatory element was already present in the HERV/LTR before its insertion. We identified 2,201 HSREs, comprising specific associations between 354 HERV/LTRs and 84 TFs. Clustering analysis showed that HERV/LTRs can be grouped according to their TF binding patterns; we identified HERV/LTR groups bound by pluripotency TFs (e.g., SOX2, POU5F1, and NANOG), embryonic endoderm/mesendoderm TFs (e.g., GATA4/6, SOX17, and FOXA1/2), hematopoietic TFs (e.g., SPI1 (PU.1), GATA1/2, and TAL1), and CTCF. Regulatory elements of HERV/LTRs tended to be located near, and/or to interact three-dimensionally with, genes involved in immune responses, suggesting that these regulatory elements play an important role in controlling the immune regulatory network. Further, we demonstrated subgroup-specific TF binding within LTR7, LTR5B, and LTR5_Hs, indicating that gains or losses of regulatory elements occurred during the genomic invasions of these HERV/LTRs. Finally, we constructed dbHERV-REs, an interactive database of HERV/LTR regulatory elements (http://herv-tfbs.com/).
This study provides fundamental information for understanding the impact of HERV/LTRs on host transcription and offers insights into the transcriptional modulation systems of HERV/LTRs and ancestral HERVs.
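The overlap step behind HERV-TFBSs, and the idea of a binding pattern shared within a HERV/LTR type, can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names, the toy coordinates, and the half-open interval convention are all assumptions made for illustration. It also measures only the fraction of copies overlapped by a peak, whereas an HSRE additionally requires a shared TF-binding motif, which is not modeled here.

```python
from collections import defaultdict

def overlaps(a_start, a_end, b_start, b_end):
    """True if two half-open intervals [start, end) share at least one base."""
    return a_start < b_end and b_start < a_end

def herv_tfbs(peaks, herv_copies):
    """Return the set of (tf, copy_id) pairs where a TF peak overlaps a HERV/LTR copy."""
    by_chrom = defaultdict(list)
    for copy_id, (chrom, start, end, _herv_type) in herv_copies.items():
        by_chrom[chrom].append((start, end, copy_id))
    hits = set()
    for tf, chrom, start, end in peaks:
        for c_start, c_end, copy_id in by_chrom[chrom]:
            if overlaps(start, end, c_start, c_end):
                hits.add((tf, copy_id))
    return hits

def shared_fraction(hits, herv_copies):
    """For each (herv_type, tf), the fraction of copies of that type bound by tf."""
    totals = defaultdict(int)
    for _chrom, _start, _end, herv_type in herv_copies.values():
        totals[herv_type] += 1
    bound = defaultdict(set)
    for tf, copy_id in hits:
        bound[(herv_copies[copy_id][3], tf)].add(copy_id)
    return {key: len(ids) / totals[key[0]] for key, ids in bound.items()}

# Hypothetical toy annotation: copy_id -> (chrom, start, end, HERV/LTR type)
herv_copies = {
    "c1": ("chr1", 100, 500, "LTR7"),
    "c2": ("chr1", 1000, 1400, "LTR7"),
    "c3": ("chr2", 200, 600, "LTR7"),
    "c4": ("chr2", 900, 1300, "LTR5_Hs"),
}
# Hypothetical toy peaks: (tf, chrom, start, end)
peaks = [
    ("NANOG", "chr1", 450, 550),    # overlaps c1
    ("NANOG", "chr1", 1400, 1500),  # only adjacent to c2, so no overlap
    ("NANOG", "chr2", 300, 350),    # overlaps c3
    ("CTCF", "chr2", 1000, 1100),   # overlaps c4
]

hits = herv_tfbs(peaks, herv_copies)
fractions = shared_fraction(hits, herv_copies)
```

In this toy example NANOG peaks overlap two of the three LTR7 copies. Whether such a fraction counts as "substantial" depends on a threshold the abstract does not state, so none is applied here.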
