Deep learning of gene relationships from single cell time-course expression data

Motivation: Time-course gene expression data has been widely used to infer regulatory and signaling relationships between genes. Most of the widely used methods for such analysis were developed for bulk expression data. Single cell RNA-Seq (scRNA-Seq) data offers several advantages, including the large number of expression profiles available and the ability to focus on individual cells rather than averages. However, this data also raises new computational challenges.

Results: Using a novel encoding for scRNA-Seq expression data, we develop deep learning methods for interaction prediction from time-course data. Our methods use a supervised framework that represents the data as a 3D tensor and trains convolutional and recurrent neural networks (CNN and RNN) to predict interactions. We tested our Time-course Deep Learning (TDL) models on five different time series scRNA-Seq datasets. As we show, TDL can accurately identify causal and regulatory gene-gene interactions and can also be used to assign new functions to genes. TDL improves on prior methods for these tasks and can be applied generally to new time series scRNA-Seq data.

Availability and Implementation: Freely available at https://github.com/xiaoyeye/TDL.

Contact: zivbj@cs.cmu.edu

Supplementary information: Supplementary data are available at XXX online.
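As a rough illustration of what representing time-course scRNA-Seq data for a gene pair as a 3D tensor could look like, the sketch below builds one normalized 2D co-expression histogram per time point and stacks them along a time axis. This is an assumption made for illustration and is not taken from the abstract: the function name `encode_gene_pair`, the log transform, the bin count, and the per-time-point histogram scheme are all hypothetical choices, and the TDL repository should be consulted for the actual encoding.

```python
import numpy as np

def encode_gene_pair(expr_a, expr_b, time_labels, n_bins=8):
    """Hypothetical encoding: stack one 2D co-expression histogram per
    time point into a tensor of shape (n_time_points, n_bins, n_bins)."""
    # Log-transform counts (an assumed preprocessing step).
    expr_a = np.log1p(np.asarray(expr_a, dtype=float))
    expr_b = np.log1p(np.asarray(expr_b, dtype=float))
    time_labels = np.asarray(time_labels)

    # Use the same bin edges at every time point so slices are comparable.
    edges_a = np.linspace(0.0, max(expr_a.max(), 1e-9), n_bins + 1)
    edges_b = np.linspace(0.0, max(expr_b.max(), 1e-9), n_bins + 1)

    slices = []
    for t in np.unique(time_labels):
        mask = time_labels == t
        hist, _, _ = np.histogram2d(
            expr_a[mask], expr_b[mask], bins=[edges_a, edges_b]
        )
        total = hist.sum()
        # Normalize each slice to an empirical joint distribution.
        slices.append(hist / total if total > 0 else hist)
    return np.stack(slices)

# Synthetic example: 300 cells sampled at 3 time points (made-up data).
rng = np.random.default_rng(0)
gene_a = rng.poisson(5, size=300)
gene_b = rng.poisson(3, size=300)
times = np.repeat([0, 1, 2], 100)
tensor = encode_gene_pair(gene_a, gene_b, times)
print(tensor.shape)
```

A tensor of this shape could then be fed to a 3D convolutional network, or split along the time axis into a sequence of 2D slices for a recurrent model; which of these matches the authors' architecture is not stated in the abstract.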
