SEdb: a comprehensive human super-enhancer database

Abstract

Super-enhancers are important for controlling and defining the expression of cell-specific genes. As research on human disease and biological processes advances, human H3K27ac ChIP-seq datasets are accumulating rapidly, creating an urgent need to collect and process these data comprehensively and efficiently. More importantly, many studies have shown that super-enhancer-associated single nucleotide polymorphisms (SNPs) and transcription factors (TFs) strongly influence human disease and biological processes. Here, we developed a comprehensive human super-enhancer database (SEdb, http://www.licpathway.net/sedb) that aims to provide a large number of resources on human super-enhancers. The database is annotated with the potential functions of super-enhancers in gene regulation. The current version of SEdb documents a total of 331 601 super-enhancers from 542 samples. In particular, unlike existing super-enhancer databases, we manually curated and classified 410 available H3K27ac samples from >2000 ChIP-seq samples in NCBI GEO/SRA. Furthermore, SEdb provides detailed genetic and epigenetic annotation of super-enhancers, including common SNPs, motif changes, expression quantitative trait loci (eQTLs), risk SNPs, transcription factor binding sites (TFBSs), CRISPR/Cas9 target sites and DNase I hypersensitivity sites (DHSs), to support in-depth analyses. SEdb will help elucidate super-enhancer-related functions and identify potential biological effects.
