RNA Sequencing Data: Hitchhiker's Guide to Expression Analysis

Gene expression is the fundamental level at which the results of various genetic and regulatory programs are observable. The measurement of transcriptome-wide gene expression has shifted decisively from microarrays to sequencing in a matter of years. RNA sequencing (RNA-seq) provides a quantitative and open system for profiling transcriptional outcomes on a large scale and therefore facilitates a wide range of applications, from basic science studies to agricultural and clinical settings. Over the past 10 years or so, much has been learned about the characteristics of RNA-seq data sets and about the performance of the many methods developed to analyze them. In this review, we give an overview of developments in RNA-seq data analysis, including experimental design, with an explicit focus on the quantification of gene expression and statistical approaches for differential expression. We also highlight emerging data types, such as single-cell RNA-seq and gene expression profiling using long-read technologies.
