Trendy: segmented regression analysis of expression dynamics in high-throughput ordered profiling experiments

Background: High-throughput expression profiling experiments with ordered conditions (e.g. time-course or spatial-course) are becoming more common for studying detailed differentiation processes or spatial patterns. Identifying dynamic changes at both the individual gene and whole transcriptome level can provide important insights about genes, pathways, and critical time points.

Results: We present an R package, Trendy, which utilizes segmented regression models to simultaneously characterize each gene’s expression pattern and summarize overall dynamic activity in ordered condition experiments. For each gene, Trendy finds the optimal segmented regression model and provides the location and direction of dynamic changes in expression. We demonstrate the utility of Trendy to provide biologically relevant results on both microarray and RNA-sequencing (RNA-seq) datasets.

Conclusions: Trendy is a flexible R package which characterizes gene-specific expression patterns and summarizes changes of global dynamics over ordered conditions. Trendy is freely available on Bioconductor with a full vignette at https://bioconductor.org/packages/release/bioc/html/Trendy.html.
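To make the idea of per-gene segmented regression concrete, the following is a minimal Python sketch rather than Trendy's actual implementation (Trendy is an R package and its internals are not described in this abstract). It assumes at most one breakpoint, a grid search over observed condition values as candidate breakpoints, and BIC as the model selection criterion. The function names (`fit_gene`, `direction`) and the slope tolerance are hypothetical and chosen only for illustration.

```python
import math


def solve(A, b):
    """Solve the small linear system A x = b by Gaussian elimination."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x


def least_squares(columns, y):
    """Ordinary least squares via the normal equations; returns (coefs, rss)."""
    p, n = len(columns), len(y)
    xtx = [[sum(columns[i][k] * columns[j][k] for k in range(n))
            for j in range(p)] for i in range(p)]
    xty = [sum(columns[i][k] * y[k] for k in range(n)) for i in range(p)]
    beta = solve(xtx, xty)
    rss = sum((y[k] - sum(beta[i] * columns[i][k] for i in range(p))) ** 2
              for k in range(n))
    return beta, rss


def bic(rss, n, n_params):
    """Gaussian BIC up to a constant; rss is floored to avoid log(0)."""
    return n * math.log(max(rss, 1e-12) / n) + n_params * math.log(n)


def fit_gene(x, y, min_points=2):
    """Choose between a straight line and a one-breakpoint continuous fit."""
    n = len(x)
    ones = [1.0] * n
    beta, rss = least_squares([ones, list(x)], y)
    best = {"breakpoint": None, "slopes": [beta[1]], "bic": bic(rss, n, 2)}
    xs = sorted(set(x))
    # Candidate breakpoints: interior condition values (assumed grid search).
    for k in xs[min_points:len(xs) - min_points]:
        hinge = [max(0.0, xi - k) for xi in x]  # zero before k, linear after
        beta, rss = least_squares([ones, list(x), hinge], y)
        score = bic(rss, n, 4)  # intercept, slope, slope change, breakpoint
        if score < best["bic"]:
            best = {"breakpoint": k,
                    "slopes": [beta[1], beta[1] + beta[2]],
                    "bic": score}
    return best


def direction(slope, tol=0.1):
    """Label a segment as 'up', 'down', or 'flat' (tol is an assumption)."""
    if slope > tol:
        return "up"
    if slope < -tol:
        return "down"
    return "flat"


# Toy gene: expression rises until condition 5, then falls.
x = list(range(10))
y = [0, 1, 2, 3, 4, 5, 4, 3, 2, 1]
fit = fit_gene(x, y)
trend = [direction(s) for s in fit["slopes"]]
```

On this toy profile the selected model has a single breakpoint with an increasing segment followed by a decreasing one, which corresponds to the "location and direction" summary described in the Results. A real analysis would allow several breakpoints and account for replicate noise, which this sketch does not attempt.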
