lncRNAtor: a comprehensive resource for functional investigation of long non-coding RNAs

MOTIVATION: Many long non-coding RNAs (lncRNAs) have been identified by deep sequencing, but molecular and cellular functions are known for only a small fraction of them. Current lncRNA databases serve mostly as catalogs and do not provide the in-depth information required to infer function. A comprehensive resource on lncRNA function is urgently needed.

RESULTS: We present a database for functional investigation of lncRNAs that encompasses annotation, sequence analysis, gene expression, protein binding and phylogenetic conservation. We compiled lncRNAs for six species (human, mouse, zebrafish, fruit fly, worm and yeast) from ENSEMBL, HGNC, MGI and lncRNAdb. Each lncRNA was analyzed for coding potential and for phylogenetic conservation in different lineages. Gene expression data from 208 RNA-Seq studies (4995 samples), collected from the GEO, ENCODE, modENCODE and TCGA databases, were used to provide expression profiles across tissues, diseases and developmental stages. Importantly, we analyzed the RNA-Seq data to identify mRNAs coexpressed with each lncRNA, which can offer insight into lncRNA function. The resulting gene list can be subjected to enrichment analysis, for example against Gene Ontology terms or KEGG pathways. Furthermore, we compiled protein-lncRNA interactions by collecting and analyzing publicly available CLIP-seq and PAR-CLIP sequencing data. Finally, we explored evolutionarily conserved lncRNAs with correlated expression between human and the five other organisms to identify functional lncRNAs. All content is accessible through a user-friendly web interface.

AVAILABILITY AND IMPLEMENTATION: lncRNAtor is available at http://lncrnator.ewha.ac.kr/.

SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
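
The coexpression step described in the Results can be illustrated with a minimal sketch. It assumes coexpression is scored by the Pearson correlation between an lncRNA's expression profile and each mRNA's profile over the same samples; the abstract does not state which measure or cutoff lncRNAtor uses, so the function names, the `min_abs_r` threshold and the toy data below are all hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("profiles must have the same length (>= 2 samples)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant profile carries no correlation signal
    return cov / sqrt(var_x * var_y)


def coexpressed_mrnas(lncrna_profile, mrna_profiles, min_abs_r=0.8):
    """Return (gene, r) pairs with |r| >= min_abs_r, strongest first.

    min_abs_r is an illustrative cutoff, not a value taken from the paper.
    """
    hits = []
    for gene, profile in mrna_profiles.items():
        r = pearson(lncrna_profile, profile)
        if abs(r) >= min_abs_r:
            hits.append((gene, r))
    return sorted(hits, key=lambda hit: -abs(hit[1]))


# Toy expression values over five samples (hypothetical data).
lncrna = [1.0, 2.0, 3.0, 4.0, 5.0]
mrnas = {
    "GENE_A": [2.0, 4.0, 6.0, 8.0, 10.0],  # rises with the lncRNA
    "GENE_B": [5.0, 4.0, 3.0, 2.0, 1.0],   # falls as the lncRNA rises
    "GENE_C": [3.0, 1.0, 4.0, 1.0, 5.0],   # weakly related
}
hits = coexpressed_mrnas(lncrna, mrnas)
```

The gene names in `hits` form the kind of list that could then be passed to a Gene Ontology or KEGG enrichment tool.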
