dreamBase: DNA modification, RNA regulation and protein binding of expressed pseudogenes in human health and disease

Abstract: Although thousands of pseudogenes have been annotated in the human genome, their transcriptional regulation, expression profiles and functional mechanisms remain largely unknown. In this study, we developed dreamBase (http://rna.sysu.edu.cn/dreamBase) to facilitate the investigation of DNA modification, RNA regulation and protein binding of potentially expressed pseudogenes from multidimensional high-throughput sequencing data. Based on ∼5500 ChIP-seq and DNase-seq datasets, we identified genome-wide binding profiles of various transcription-associated factors around pseudogene loci. By integrating ∼18 000 RNA-seq datasets, we analysed the expression profiles of pseudogenes and explored their co-expression patterns with their parent genes in 32 cancers and 31 normal tissues. By incorporating microRNA binding sites, we identified complex post-transcriptional regulatory networks involving 275 microRNAs and 1201 pseudogenes. We generated ceRNA networks to illustrate the crosstalk between pseudogenes and their parent genes through competitive binding of microRNAs. In addition, we studied transcriptome-wide interactions between RNA binding proteins (RBPs) and pseudogenes based on 458 CLIP-seq datasets. In conjunction with epitranscriptome sequencing data, we also mapped 1039 RNA modification sites onto 635 pseudogenes. This database will provide insights into the transcriptional regulation, expression, functions and mechanisms of pseudogenes as well as their roles in biological processes and diseases.
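The ceRNA idea summarised above, that a pseudogene and its parent gene compete for the same microRNAs, can be illustrated with a small sketch. The abstract does not state how dreamBase scores such pairs, so the hypergeometric overlap test below is an assumption chosen for illustration, and the function name `shared_mirna_pvalue` and the miRNA labels are hypothetical.

```python
from math import comb


def shared_mirna_pvalue(mirnas_a, mirnas_b, total_mirnas):
    """Return (k, p): the number of shared miRNAs and the hypergeometric
    probability of observing at least k shared miRNAs by chance.

    Assumption: both transcripts draw their miRNAs from the same pool of
    `total_mirnas` candidates, and the pool is at least as large as each set.
    """
    a, b = set(mirnas_a), set(mirnas_b)
    k = len(a & b)                      # observed overlap
    n_a, n_b = len(a), len(b)
    denom = comb(total_mirnas, n_b)     # all ways to pick n_b miRNAs
    # Sum the upper tail P(X = i) for i = k .. min(n_a, n_b).
    tail = sum(
        comb(n_a, i) * comb(total_mirnas - n_a, n_b - i)
        for i in range(k, min(n_a, n_b) + 1)
    )
    return k, tail / denom


# Hypothetical miRNA sets for a pseudogene and its parent gene.
pseudogene_mirnas = {"miR-A", "miR-B", "miR-C"}
parent_mirnas = {"miR-B", "miR-C", "miR-D"}

# 275 is the number of microRNAs reported in the abstract; using it as the
# background pool size is an assumption of this sketch.
shared, p_value = shared_mirna_pvalue(pseudogene_mirnas, parent_mirnas, 275)
print(shared, p_value)
```

A small p-value here only says that the overlap is larger than expected for random miRNA sets of the same size; it does not by itself show competitive binding, which would also require expression evidence such as the co-expression patterns mentioned in the abstract.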
