Learning Asynchronous Boolean Networks From Single-Cell Data Using Multiobjective Cooperative Genetic Programming

Recent advances in high-throughput single-cell technologies provide new opportunities for computational modeling of gene regulatory networks (GRNs) with an unprecedented amount of gene expression data. Current studies on the Boolean network (BN) modeling of GRNs mostly depend on bulk time-series data and focus on the synchronous update scheme due to its computational simplicity and tractability. However, such synchrony is a strong assumption that is rarely biologically realistic. In this study, we instead adopt the asynchronous update scheme and propose a novel framework called SgpNet that infers asynchronous BNs from single-cell data by formulating the inference task as a multiobjective optimization problem. SgpNet aims to find BNs that match the asynchronous state transition graph (STG) extracted from single-cell data while retaining the sparsity of GRNs. To search the huge solution space efficiently, we encode each Boolean function as a tree in genetic programming and evolve all functions of a network simultaneously via cooperative coevolution. In addition, we develop a regulator preselection strategy that exploits GRN sparsity to further enhance learning efficiency. An error threshold estimation heuristic is also proposed to reduce tedious parameter tuning. SgpNet is compared with the state-of-the-art method on both synthetic data and experimental single-cell data. Results show that SgpNet achieves comparable inference accuracy while having far fewer parameters and eliminating artificial restrictions on the Boolean function structures. Furthermore, SgpNet can potentially scale to large networks via straightforward parallelization on multiple cores.
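
To make the asynchronous update scheme and the tree encoding concrete, the following is a minimal sketch and not the SgpNet implementation. It assumes a hypothetical three-gene network whose rules are invented for illustration, represents each Boolean function as a nested-tuple expression tree, and enumerates the asynchronous STG by updating one gene at a time. The helpers `regulators` and `unmatched_transitions` are simplified stand-ins for the sparsity and STG-matching objectives; the paper's actual objective definitions may differ.

```python
from itertools import product

# Expression trees as nested tuples (an assumed encoding for illustration):
#   ("var", i), ("not", t), ("and", l, r), ("or", l, r)
def evaluate(tree, state):
    """Evaluate a Boolean expression tree on a state (tuple of 0/1)."""
    op = tree[0]
    if op == "var":
        return state[tree[1]]
    if op == "not":
        return 1 - evaluate(tree[1], state)
    if op == "and":
        return evaluate(tree[1], state) & evaluate(tree[2], state)
    if op == "or":
        return evaluate(tree[1], state) | evaluate(tree[2], state)
    raise ValueError(f"unknown operator: {op!r}")

def async_successors(state, functions):
    """States reachable by updating exactly one gene whose value changes."""
    successors = set()
    for i, tree in enumerate(functions):
        new_value = evaluate(tree, state)
        if new_value != state[i]:
            successors.add(state[:i] + (new_value,) + state[i + 1:])
    return successors  # empty for a fixed point

def async_stg(functions):
    """Asynchronous state transition graph over all 2^n states."""
    n = len(functions)
    return {s: async_successors(s, functions) for s in product((0, 1), repeat=n)}

def regulators(tree):
    """Indices of the genes appearing in a tree (a simple sparsity proxy)."""
    if tree[0] == "var":
        return {tree[1]}
    return set().union(*(regulators(child) for child in tree[1:]))

def unmatched_transitions(functions, observed):
    """Count observed (source, target) edges the network cannot reproduce."""
    return sum(1 for s, t in observed if t not in async_successors(s, functions))

# Hypothetical network over genes (A, B, C) = indices (0, 1, 2).
NETWORK = [
    ("not", ("var", 2)),              # A <- NOT C
    ("and", ("var", 0), ("var", 2)),  # B <- A AND C
    ("or", ("var", 0), ("var", 1)),   # C <- A OR B
]

stg = async_stg(NETWORK)
print(sorted(stg[(1, 0, 1)]))  # [(0, 0, 1), (1, 1, 1)]
```

In state (1, 0, 1) both A and B are due to change, so the asynchronous scheme yields two successors, whereas a synchronous update would yield the single state (0, 1, 1). This branching is what makes the asynchronous STG nondeterministic.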