A two-stage methodology for gene regulatory network extraction from time-course gene expression data

The discovery of gene regulatory networks (GRNs) from time-course gene expression data (gene trajectory data) is useful for (1) identifying important genes in relation to a disease or a biological function; (2) understanding the dynamic interactions between genes; (3) predicting gene expression values at future time points; and, accordingly, (4) predicting drug effects over time. In this paper, we propose a two-stage methodology, implemented in the software "gene network explorer (GNetXP)", for extracting GRNs from gene trajectory data. In the first stage, we apply a hybrid genetic and expectation-maximization algorithm to cluster the large number of gene trajectories, fitting the trajectory data with a mixture of multiple linear regression models. In the second stage, we apply the Kalman filter to identify a set of first-order differential equations that describe the dynamics of the representative trajectories, and we use these equations to discover important gene interactions and to predict gene expression values at future time points. The proposed method is demonstrated on human fibroblast response gene expression data.
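To make the second stage more concrete, the following is a minimal sketch and not the GNetXP implementation. It assumes that the first-order dynamics are approximated in discrete time by a linear model x[t+1] = A x[t], and that the entries of A are treated as a constant hidden state estimated with a standard Kalman filter update. The function names `estimate_transition_matrix` and `predict`, the prior variance `p0`, and the noise variance `r` are hypothetical choices made for illustration only.

```python
import numpy as np

def estimate_transition_matrix(traj, p0=1e3, r=1e-4):
    """Estimate A in x[t+1] = A x[t] from one multivariate trajectory.

    traj: array of shape (T, n), one row per time point, one column per
          representative trajectory (e.g. one per cluster).
    p0, r: assumed prior variance of the parameters and assumed
           observation-noise variance (illustrative values).
    """
    traj = np.asarray(traj, dtype=float)
    T, n = traj.shape
    theta = np.zeros(n * n)          # row-major entries of A (the hidden state)
    P = p0 * np.eye(n * n)           # parameter covariance
    R = r * np.eye(n)                # observation-noise covariance
    I = np.eye(n * n)
    for t in range(T - 1):
        # Observation model: x[t+1] = H theta, with H built from x[t].
        H = np.kron(np.eye(n), traj[t][None, :])
        y = traj[t + 1]
        S = H @ P @ H.T + R                  # innovation covariance
        K = P @ H.T @ np.linalg.inv(S)       # Kalman gain
        theta = theta + K @ (y - H @ theta)  # state (parameter) update
        P = (I - K @ H) @ P                  # covariance update
    return theta.reshape(n, n)

def predict(A, x_last, steps):
    """Roll the fitted linear dynamics forward from the last observation."""
    x = np.asarray(x_last, dtype=float)
    out = []
    for _ in range(steps):
        x = A @ x
        out.append(x)
    return np.array(out)

if __name__ == "__main__":
    # Synthetic two-variable example (made-up values, not real expression data).
    A_true = np.array([[0.9, 0.2], [-0.2, 0.9]])
    traj = [np.array([1.0, 0.5])]
    for _ in range(29):
        traj.append(A_true @ traj[-1])
    traj = np.array(traj)
    A_hat = estimate_transition_matrix(traj)
    print(np.round(A_hat, 3))
    print(predict(A_hat, traj[-1], steps=3))
```

Under these assumptions, an off-diagonal entry A[i, j] that is clearly different from zero can be read as a candidate influence of representative trajectory j on trajectory i, and `predict` gives expression values at future time points by iterating the fitted model.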