SINCERITIES: inferring gene regulatory networks from time-stamped single cell transcriptional expression profiles

Abstract

Motivation: Single cell transcriptional profiling opens up a new avenue for studying the functional role of cell-to-cell variability in physiological processes. The analysis of single cell expression profiles creates new challenges due to the distributional nature of the data and the stochastic dynamics of the gene transcription process. The reconstruction of gene regulatory networks (GRNs) using single cell transcriptional profiles is particularly challenging, especially when directed gene-gene relationships are desired.

Results: We developed SINCERITIES (SINgle CEll Regularized Inference using TIme-stamped Expression profileS) for the inference of GRNs from single cell transcriptional profiles. We focused on time-stamped cross-sectional expression data, commonly generated from transcriptional profiling of single cells collected at multiple time points after cell stimulation. SINCERITIES recovers directed regulatory relationships among genes by employing regularized linear regression (ridge regression), using temporal changes in the distributions of gene expression. The modes of gene regulation (activation or repression) are determined from partial correlation analyses between pairs of genes. We demonstrated the efficacy of SINCERITIES in inferring GRNs using in silico time-stamped single cell expression data and single cell transcriptional profiles of THP-1 monocytic human leukemia cells. The case studies showed that SINCERITIES could provide accurate GRN predictions, significantly better than other GRN inference algorithms such as TSNI, GENIE3 and JUMP3. Moreover, SINCERITIES has a low computational complexity and is amenable to problems of extremely large dimensionality. Finally, an application of SINCERITIES to single cell expression data of T2EC chicken erythrocytes pointed to BATF as a candidate novel regulator of erythroid development.

Availability and implementation: MATLAB and R versions of SINCERITIES are freely available from the following websites: http://www.cabsel.ethz.ch/tools/sincerities.html and https://github.com/CABSEL/SINCERITIES. The single cell THP-1 and T2EC transcriptional profiles are available from the original publications (Kouno et al., 2013; Richard et al., 2016). The in silico single cell data are available on the SINCERITIES websites.

Supplementary information: Supplementary data are available at Bioinformatics online.
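The regression idea described in the Results can be illustrated with a minimal Python sketch. This is not the authors' MATLAB or R implementation. It makes several assumptions that the abstract does not state: the change in each gene's distribution between consecutive time points is measured with a two-sample Kolmogorov-Smirnov distance and divided by the time gap, the change in a target gene over one window is regressed on the changes of all genes over the preceding window, and the ridge problem is solved in closed form with a fixed penalty `lam`. All function names here are hypothetical, and the partial-correlation step that assigns activation or repression is omitted.

```python
from bisect import bisect_right


def ks_distance(a, b):
    """Two-sample Kolmogorov-Smirnov statistic: largest gap between empirical CDFs."""
    a, b = sorted(a), sorted(b)
    return max(
        abs(bisect_right(a, x) / len(a) - bisect_right(b, x) / len(b))
        for x in a + b
    )


def distribution_distances(snapshots, times):
    """snapshots[k][j] holds the expression values of gene j across cells at times[k].

    Returns one row per time window; entry j is the KS distance of gene j
    between the two ends of the window, divided by the window length (assumed scaling).
    """
    rows = []
    for k in range(len(times) - 1):
        dt = times[k + 1] - times[k]
        rows.append([ks_distance(g0, g1) / dt
                     for g0, g1 in zip(snapshots[k], snapshots[k + 1])])
    return rows


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [mr - f * mc for mr, mc in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def ridge(X, y, lam):
    """Closed-form ridge regression: solve (X^T X + lam * I) w = X^T y."""
    n = len(X[0])
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(n)]
    return solve(A, b)


def infer_edge_weights(snapshots, times, lam=0.1):
    """weights[j][i] scores a directed edge from regulator i to target j.

    Needs at least three time points so that one window can predict the next.
    """
    dd = distribution_distances(snapshots, times)
    X, Y = dd[:-1], dd[1:]  # earlier windows predict later windows
    n_genes = len(dd[0])
    return [ridge(X, [row[j] for row in Y], lam) for j in range(n_genes)]
```

A call such as `infer_edge_weights(snapshots, [0, 1, 2])` returns a square matrix of unsigned edge scores, one row per target gene. In practice the penalty would be chosen by cross-validation rather than fixed, and the scores would be ranked to produce a candidate network.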
