Differential Expression Analysis of RNA-seq Reads: Overview, Taxonomy, and Tools

Analysis of RNA sequencing (RNA-seq) data is widely used in transcriptomic studies and has many applications. We review RNA-seq data analysis from raw reads to the results of differential expression analysis. We also give a descriptive comparison of the tools used at each step, discuss their important characteristics, and provide a taxonomy of these tools. Issues in quality control and visualization of RNA-seq data are discussed along with useful tools. Finally, we offer guidelines for the RNA-seq data analyst and outline open research issues and challenges.
