PRADA: pipeline for RNA sequencing data analysis

SUMMARY Technological advances in high-throughput sequencing necessitate improved computational tools for processing and analyzing large-scale datasets in a systematic automated manner. For that purpose, we have developed PRADA (Pipeline for RNA-Sequencing Data Analysis), a flexible, modular and highly scalable software platform that provides many different types of information available by multifaceted analysis starting from raw paired-end RNA-seq data: gene expression levels, quality metrics, detection of unsupervised and supervised fusion transcripts, detection of intragenic fusion variants, homology scores and fusion frame classification. PRADA uses a dual-mapping strategy that increases sensitivity and refines the analytical endpoints. PRADA has been used extensively and successfully in the glioblastoma and renal clear cell projects of The Cancer Genome Atlas program. AVAILABILITY AND IMPLEMENTATION http://sourceforge.net/projects/prada/ CONTACT  gadgetz@broadinstitute.org or rverhaak@mdanderson.org SUPPLEMENTARY INFORMATION  Supplementary data are available at Bioinformatics online.

[1]  Chris Williams,et al.  RNA-SeQC: RNA-seq metrics for quality control and process optimization , 2012, Bioinform..

[2]  S. Salzberg,et al.  TopHat-Fusion: an algorithm for discovery of novel fusion transcripts , 2011, Genome Biology.

[3]  D. Brat,et al.  Transforming Fusions of FGFR and TACC Genes in Human Glioblastoma , 2012, Science.

[4]  The Cancer Genome Atlas Research Network COMPREHENSIVE MOLECULAR CHARACTERIZATION OF CLEAR CELL RENAL CELL CARCINOMA , 2013, Nature.

[5]  Zachary A. Szpiech,et al.  Genotype, haplotype and copy-number variation in worldwide human populations , 2008, Nature.

[6]  S. Gabriel,et al.  Integrated genomic analysis identifies clinically relevant subtypes of glioblastoma characterized by abnormalities in PDGFRA, IDH1, EGFR, and NF1. , 2010, Cancer cell.

[7]  Gonçalo R. Abecasis,et al.  The Sequence Alignment/Map format and SAMtools , 2009, Bioinform..

[8]  Ken Chen,et al.  A survey of intragenic breakpoints in glioblastoma identifies a distinct subset associated with poor survival. , 2013, Genes & development.

[9]  M. DePristo,et al.  The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. , 2010, Genome research.

[10]  Süleyman Cenk Sahinalp,et al.  deFuse: An Algorithm for Gene Fusion Discovery in Tumor RNA-Seq Data , 2011, PLoS Comput. Biol..

[11]  J. Maguire,et al.  Integrative analysis of the melanoma transcriptome. , 2010, Genome research.

[12]  Steven J. M. Jones,et al.  Comprehensive molecular characterization of clear cell renal cell carcinoma , 2013, Nature.

[13]  Richard Durbin,et al.  Sequence analysis Fast and accurate short read alignment with Burrows – Wheeler transform , 2009 .