Curse: building expression atlases and co-expression networks from public RNA-Seq data

Summary: Public RNA-Sequencing (RNA-Seq) datasets are a valuable resource for transcriptome analyses, but their accessibility is hindered by the imperfect quality and presentation of their metadata and by the complexity of processing raw sequencing data. The Curse suite was created to alleviate these problems. It consists of an online curation tool named Curse to efficiently build compendia of experiments hosted on the Sequence Read Archive, and a lightweight pipeline named Prose to download and process the RNA-Seq data into expression atlases and co-expression networks. Curse networks showed improved linking of functionally related genes compared to the state-of-the-art.

Availability and implementation: Curse, Prose, and their manuals are available at http://bioinformatics.psb.ugent.be/webtools/Curse/. Prose was implemented in Java.

Supplementary information: Supplementary data are available at Bioinformatics online.
