CenhANCER: a comprehensive cancer enhancer database for primary tissues and cell lines

Abstract

Enhancers are key tumorigenic factors with wide applications in the subtyping, diagnosis and treatment of cancer, and they are attracting increasing attention in cancer research. However, systematic analysis of cancer enhancers remains challenging because integrative data resources are lacking, especially resources derived from primary tumor tissues. To provide a comprehensive enhancer profile across cancer types, we developed CenhANCER, a cancer enhancer database built by curating public resources, including all publicly available H3K27ac ChIP-Seq data from 805 primary tissue samples and 671 cell line samples across 41 cancer types. In total, 57 029 408 typical enhancers, 978 411 super-enhancers and 226 726 enriched transcription factors were identified. For further functional analysis, we annotated the super-enhancers with chromatin accessibility regions, cancer expression quantitative trait loci (eQTLs), Genotype-Tissue Expression (GTEx) eQTLs and genome-wide association study (GWAS) risk single nucleotide polymorphisms (SNPs). The identified enhancers were highly consistent with accessible chromatin regions in the corresponding cancer types, and all 10 super-enhancer regions reported in one colorectal cancer study were recapitulated in CenhANCER; both results attest to the high quality of the data. By offering high-quality cancer enhancer candidates and transcription factors that are potential therapeutic targets across multiple cancer types, CenhANCER provides a credible resource for single-cancer analysis and for comparative studies across cancer types.

Database URL: http://cenhancer.chenzxlab.cn/
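The annotation of super-enhancers with risk SNPs described above amounts to an interval-overlap lookup. The following is a minimal sketch of that idea only, not the pipeline used to build CenhANCER: the function name `annotate_with_snps`, the 0-based half-open coordinate convention, and the toy coordinates and SNP identifiers are all assumptions made for illustration.

```python
from bisect import bisect_left
from collections import defaultdict


def annotate_with_snps(super_enhancers, snps):
    """Attach to each super-enhancer the IDs of the SNPs that fall inside it.

    super_enhancers: list of (chrom, start, end), 0-based half-open (assumed convention)
    snps: list of (chrom, pos, snp_id), 0-based positions (assumed convention)
    Returns a list of (chrom, start, end, [snp_id, ...]) in input order.
    """
    # Group SNPs by chromosome and sort them by position for binary search.
    by_chrom = defaultdict(list)
    for chrom, pos, snp_id in snps:
        by_chrom[chrom].append((pos, snp_id))
    for chrom in by_chrom:
        by_chrom[chrom].sort()
    positions = {chrom: [pos for pos, _ in hits] for chrom, hits in by_chrom.items()}

    annotated = []
    for chrom, start, end in super_enhancers:
        pos_list = positions.get(chrom, [])
        hits = by_chrom.get(chrom, [])
        # SNPs with start <= pos < end lie inside the half-open interval.
        lo = bisect_left(pos_list, start)
        hi = bisect_left(pos_list, end)
        annotated.append((chrom, start, end, [snp_id for _, snp_id in hits[lo:hi]]))
    return annotated


# Toy input: hypothetical regions and SNP identifiers, not real CenhANCER records.
toy_super_enhancers = [("chr1", 100, 200), ("chr1", 500, 600), ("chr2", 50, 80)]
toy_snps = [
    ("chr1", 150, "snp_a"),
    ("chr1", 199, "snp_b"),
    ("chr1", 200, "snp_c"),
    ("chr2", 10, "snp_d"),
]

for record in annotate_with_snps(toy_super_enhancers, toy_snps):
    print(record)
```

Sorting the SNP positions once per chromosome keeps each lookup logarithmic in the number of SNPs, which matters when regions number in the hundreds of thousands. The same pattern would apply to eQTL positions, under the same coordinate assumptions.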
