A comparison of single-cell trajectory inference methods

Trajectory inference approaches analyze genome-wide omics data from thousands of single cells and computationally infer the order of these cells along developmental trajectories. Although more than 70 trajectory inference tools have already been developed, it is challenging to compare their performance because the input they require and the output models they produce vary substantially. Here, we benchmark 45 of these methods on 110 real and 229 synthetic datasets for cellular ordering, topology, scalability and usability. Our results highlight the complementarity of existing tools and show that the choice of method should depend mostly on the dataset dimensions and trajectory topology. Based on these results, we develop a set of guidelines to help users select the best method for their dataset. Our freely available data and evaluation pipeline (https://benchmark.dynverse.org) will aid in the development of improved tools designed to analyze increasingly large and complex single-cell datasets.

The authors comprehensively benchmark the accuracy, scalability, stability and usability of 45 single-cell trajectory inference methods.
