DCI: learning causal differences between gene regulatory networks

SUMMARY: Designing interventions to control gene regulation requires modeling a gene regulatory network as a causal graph. Large-scale expression datasets are now being collected across different conditions, cell types, disease states and developmental time points. However, applying classical causal inference algorithms to such data to infer gene regulatory networks remains challenging, because these algorithms require large sample sizes and substantial computational resources. Here, we describe an algorithm that efficiently learns the differences in gene regulatory mechanisms between conditions. Our difference causal inference (DCI) algorithm infers changes (i.e., edges that appeared, disappeared or changed weight) between two causal graphs given gene expression data from the two conditions. The algorithm is efficient in its use of both samples and computation because it infers the differences between the causal graphs directly, without estimating each possibly large causal graph separately. We provide a user-friendly Python implementation of DCI, which also lets the user learn the most robust difference causal graph across tuning parameters via stability selection. Finally, we show how to apply DCI to single-cell RNA-seq data from different conditions and cell states, and we validate the algorithm by predicting the effects of interventions.

AVAILABILITY AND IMPLEMENTATION: The Python package is freely available at http://uhlerlab.github.io/causaldag/dci.

SUPPLEMENTARY INFORMATION: Supplementary information is available at Bioinformatics online.
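To make the notion of a difference causal graph concrete, the following is a minimal sketch, not the DCI algorithm and not the causaldag API. It assumes linear Gaussian structural equation models with a known causal ordering of the genes, estimates each condition's edge weights separately by least squares, and then labels every edge as appeared, disappeared or changed weight. Estimating both graphs separately is exactly the naive baseline that DCI is designed to avoid. All names (`simulate_linear_sem`, `estimate_weights`, `classify_edge_changes`), the example weights and the threshold `tol` are hypothetical choices made for illustration.

```python
import numpy as np


def simulate_linear_sem(B, n_samples, rng):
    """Sample from a linear SEM where B[i, j] is the weight of edge i -> j.

    Assumes B is strictly upper triangular (variables are in causal order)
    and the noise is standard Gaussian.
    """
    p = B.shape[0]
    noise = rng.standard_normal((n_samples, p))
    # Row-vector form: X = X @ B + noise  =>  X = noise @ inv(I - B)
    return noise @ np.linalg.inv(np.eye(p) - B)


def estimate_weights(X):
    """Naive estimate: regress each variable on all its predecessors."""
    p = X.shape[1]
    B_hat = np.zeros((p, p))
    for j in range(1, p):
        coef, *_ = np.linalg.lstsq(X[:, :j], X[:, j], rcond=None)
        B_hat[:j, j] = coef
    return B_hat


def classify_edge_changes(B1, B2, tol=1e-8):
    """Label each edge i -> j whose weight differs between the two graphs."""
    changes = {}
    p = B1.shape[0]
    for i in range(p):
        for j in range(p):
            w1, w2 = B1[i, j], B2[i, j]
            if abs(w1 - w2) <= tol:
                continue  # edge is the same in both conditions
            if abs(w1) <= tol:
                changes[(i, j)] = "appeared"
            elif abs(w2) <= tol:
                changes[(i, j)] = "disappeared"
            else:
                changes[(i, j)] = "changed weight"
    return changes


# Hypothetical 4-gene example: condition 2 loses 1 -> 2, gains 2 -> 3,
# and doubles the weight of 0 -> 3; edge 0 -> 1 is unchanged.
B1 = np.zeros((4, 4))
B1[0, 1], B1[1, 2], B1[0, 3] = 0.8, 0.5, 0.7
B2 = np.zeros((4, 4))
B2[0, 1], B2[0, 3], B2[2, 3] = 0.8, 1.4, 0.6

true_changes = classify_edge_changes(B1, B2)

rng = np.random.default_rng(0)
X1 = simulate_linear_sem(B1, 5000, rng)
X2 = simulate_linear_sem(B2, 5000, rng)
# A loose threshold absorbs estimation noise in the fitted weights.
estimated_changes = classify_edge_changes(
    estimate_weights(X1), estimate_weights(X2), tol=0.2
)
```

With many samples and a known ordering, the naive estimate recovers the three changed edges in this toy setting. In realistic settings the ordering is unknown and the graphs are large, which is where estimating the difference directly becomes attractive.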
