OverGeneDB: a database of 5′ end protein coding overlapping genes in human and mouse genomes

Abstract

Gene overlap serves various regulatory functions at the transcriptional and post-transcriptional levels. Most current studies focus on protein-coding genes that overlap with non-protein-coding counterparts, the so-called natural antisense transcripts. Considerably less is known about the role of gene overlap between two protein-coding genes. Here, we provide OverGeneDB, a database of human and mouse 5′ end protein-coding overlapping genes. The database contains 582 human and 113 mouse gene pairs that are transcribed using overlapping promoters in at least one analyzed library. Gene pairs were identified based on the analysis of transcription start site (TSS) coordinates in 73 human and 10 mouse organs, tissues and cell lines. Besides TSS data, resources for 26 human lung adenocarcinoma cell lines also contain RNA-Seq data and ChIP-Seq data for seven histone modifications and RNA Polymerase II activity. The collected data revealed that the overlap region is rarely conserved between the studied species and tissues. In ∼50% of the overlapping genes, transcription started exclusively in the overlap regions. In the remaining half, transcription was initiated from both overlapping and non-overlapping TSSs. OverGeneDB is accessible at http://overgenedb.amu.edu.pl.
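To make the notion of a 5′ end overlap concrete, the following is a minimal sketch of how a head-to-head overlap between two transcripts on opposite strands could be tested from TSS coordinates. It is an illustration only and not the pipeline used to build OverGeneDB: the `Transcript` class, its field names and the `five_prime_overlap` function are hypothetical, and the sketch assumes one TSS and one 3′ end per transcript on a shared coordinate system.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Transcript:
    """Hypothetical record; field names are assumptions, not OverGeneDB's schema."""
    name: str
    chrom: str
    strand: str  # "+" or "-"
    tss: int     # transcription start site (5' end)
    tes: int     # transcript end site (3' end)


def five_prime_overlap(a: Transcript, b: Transcript) -> Optional[Tuple[int, int]]:
    """Return the (start, end) of a head-to-head overlap, or None.

    Two transcripts overlap at their 5' ends when they lie on opposite
    strands of the same chromosome and each TSS falls inside the other
    transcript. The overlap then spans from the plus-strand TSS to the
    minus-strand TSS.
    """
    if a.chrom != b.chrom or a.strand == b.strand:
        return None
    plus, minus = (a, b) if a.strand == "+" else (b, a)
    # Plus-strand transcript covers [plus.tss, plus.tes];
    # minus-strand transcript covers [minus.tes, minus.tss].
    if minus.tes <= plus.tss <= minus.tss <= plus.tes:
        return (plus.tss, minus.tss)
    return None


# Usage with made-up coordinates
gene_a = Transcript("A", "chr1", "+", tss=1000, tes=5000)
gene_b = Transcript("B", "chr1", "-", tss=1200, tes=200)
overlap = five_prime_overlap(gene_a, gene_b)
```

With these made-up coordinates the two TSSs lie inside each other's transcripts, so the function reports the interval between them. Divergent pairs whose TSSs do not cross, tail-to-tail pairs and nested genes all return `None` under this definition.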
