DTWscore: differential expression and cell clustering analysis for time-series single-cell RNA-seq data

Background: The development of single-cell RNA sequencing has enabled profound discoveries in biology, ranging from the dissection of the composition of complex tissues to the identification of novel cell types and dynamics in some specialized cellular environments. However, the large-scale generation of single-cell RNA-seq (scRNA-seq) data collected at multiple time points remains a challenge for the effective measurement of gene expression patterns in transcriptome analysis.

Results: We present an algorithm based on the Dynamic Time Warping score (DTWscore) that, combined with time-series data, enables the detection of gene expression changes across scRNA-seq samples and the recovery of potential cell types from complex mixtures of multiple cell types.

Conclusions: The DTWscore successfully classifies cells of different types using the most highly variable genes from time-series scRNA-seq data. The study was confined to methods that are implemented and available within the R framework. Sample datasets and R packages are available at https://github.com/xiaoxiaoxier/DTWscore.
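For readers unfamiliar with dynamic time warping, the following is a minimal sketch of the classic DTW distance between two expression time series. It is not the authors' DTWscore definition, which the abstract does not spell out; the function name `dtw_distance` and the absolute-difference local cost are assumptions made for illustration only.

```python
from math import inf


def dtw_distance(x, y):
    """Classic dynamic time warping distance between two numeric sequences.

    Illustrative only: uses |a - b| as the local cost and the standard
    three-way step pattern. This is an assumption, not the DTWscore itself.
    """
    n, m = len(x), len(y)
    # cost[i][j] holds the minimal accumulated cost of aligning x[:i] with y[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = abs(x[i - 1] - y[j - 1])
            # extend the cheapest of: insertion, deletion, or match
            cost[i][j] = local + min(
                cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1]
            )
    return cost[n][m]
```

Because warping lets one time point align with several, a series and a stretched copy of it (for example `[1, 2, 3]` and `[1, 2, 2, 3]`) have distance zero, whereas a gene whose expression pattern genuinely differs between conditions accumulates a positive cost.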
