Logic programming to infer complex RNA expression patterns from RNA-seq data

Numerous long noncoding RNA (lncRNA) databases are available to meet the increasing demand in the field. However, although many lncRNAs are expressed specifically in certain cell types and/or in a time-dependent manner, most lncRNA databases fall short of providing such profiles. We developed a strategy that uses logic programming to handle the complex organization of organs, their tissues and cell types, as well as gender and developmental time points. To showcase this strategy, we introduce 'RenalDB' (http://renaldb.uni-frankfurt.de), a database providing expression profiles of RNAs in major organs, with a focus on kidney tissues and cells. RenalDB uses logic programming to describe complex anatomy, sample metadata and the logical relationships that define expression, enrichment or specificity. We validated the content of RenalDB with biological experiments and functionally characterized two long intergenic noncoding RNAs: LOC440173 is important for cell growth or cell survival, whereas PAXIP1-AS1 is a regulator of cell death. We anticipate that RenalDB will be used as a first step toward functional studies of lncRNAs in the kidney.
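
To illustrate the general idea of encoding anatomy, sample metadata and expression relationships as logical facts and rules, here is a minimal sketch in Python. It is not RenalDB's actual rule base: the anatomy terms, sample identifiers, gene names, expression values, the threshold of 1.0 and the exact definitions of "expressed" and "specific" are all assumptions chosen for illustration, and each rule is written as a Prolog-style comment above the function that evaluates it.

```python
# Hypothetical anatomy facts: part_of(Child, Parent).
PART_OF = {
    "podocyte": "glomerulus",
    "glomerulus": "kidney",
    "proximal_tubule": "kidney",
    "cardiomyocyte": "heart",
}

# Hypothetical sample metadata: sample id -> (anatomical term, sex, stage).
SAMPLES = {
    "s1": ("podocyte", "female", "adult"),
    "s2": ("proximal_tubule", "male", "adult"),
    "s3": ("cardiomyocyte", "male", "adult"),
    "s4": ("kidney", "female", "E14.5"),
}

# Hypothetical expression values: (gene, sample id) -> abundance.
EXPRESSION = {
    ("geneA", "s1"): 12.0, ("geneA", "s2"): 0.2,
    ("geneA", "s3"): 0.1, ("geneA", "s4"): 3.0,
    ("geneB", "s1"): 5.0, ("geneB", "s2"): 6.0,
    ("geneB", "s3"): 7.0, ("geneB", "s4"): 4.0,
}

THRESHOLD = 1.0  # assumed cutoff for calling a gene "expressed"


def within(term):
    """within(X, X).  within(X, Z) :- part_of(X, Y), within(Y, Z)."""
    chain = [term]
    while term in PART_OF:
        term = PART_OF[term]
        chain.append(term)
    return chain  # the term itself, then each enclosing structure up to the organ


def expressed_in(gene, stage=None):
    """expressed(G, T) :- sample(S, T0, _, Stage), within(T0, T), value(G, S) >= THRESHOLD."""
    terms = set()
    for (g, sample_id), value in EXPRESSION.items():
        term, _sex, sample_stage = SAMPLES[sample_id]
        if g != gene or value < THRESHOLD:
            continue
        if stage is not None and sample_stage != stage:
            continue
        terms.update(within(term))
    return terms


def specific_to(gene, organ):
    """specific(G, O) :- expressed(G, O), \\+ (expressed(G, O2), organ(O2), O2 \\= O)."""
    organs = {within(t)[-1] for t in expressed_in(gene)}
    return organs == {organ}


if __name__ == "__main__":
    print(sorted(expressed_in("geneA")))
    print(specific_to("geneA", "kidney"), specific_to("geneB", "kidney"))
```

Because expression is propagated upward through the part-of hierarchy, a gene detected only in podocytes is also reported as expressed in the glomerulus and the kidney, which is the kind of inference that is awkward to express with flat relational tables. A dedicated logic programming language would state these rules declaratively instead of evaluating them by hand as done here.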
