A Maximum A Posteriori Probability and Time-Varying Approach for Inferring Gene Regulatory Networks from Time Course Gene Microarray Data

Unlike most conventional techniques, which assume a static model, this paper aims to estimate time-varying model parameters and to identify the significant genes involved at different timepoints from time course gene microarray data. We first formulate the parameter identification problem as a new maximum a posteriori probability (MAP) estimation problem, so that prior information can be incorporated as regularization terms to reduce the large estimation variance of this high-dimensional problem. Under this framework, sparsity and temporal consistency of the model parameters are imposed using L1-regularization and novel continuity constraints, respectively. The resulting problem is solved using the L-BFGS method, with the initial guess obtained from the partial least squares method. A novel forward validation measure, based on both forward and current prediction errors, is also proposed for selecting the regularization parameters. The proposed method is evaluated on a synthetic benchmark dataset and on a publicly available yeast Saccharomyces cerevisiae cell cycle microarray dataset. For the latter in particular, a number of the significant genes identified at different timepoints are found to be biologically significant according to previous findings in biological experiments. These results suggest that the proposed approach may serve as a valuable tool for inferring time-varying gene regulatory networks in biological studies.
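
To make the structure of such an estimator concrete, the following is a minimal sketch of a MAP objective with a sparsity term and a temporal continuity term. It is an illustration under stated assumptions, not the formulation used in the paper: the abstract specifies neither the dynamic model nor the form of the continuity constraints. Here we assume a linear time-varying model x_{t+1} = A_t x_t + e_t with Gaussian noise, a Laplace prior on the entries of A_t (which yields the L1 term), and a Gaussian prior on successive differences A_t - A_{t-1} (one possible continuity constraint). The symbols x_t, A_t, \lambda_1 and \lambda_2 are introduced only for this sketch.

```latex
% Assumed model: x_{t+1} = A_t x_t + e_t, with observations x_1, ..., x_{T+1}.
% Up to additive constants and a common scaling, maximizing the posterior
% is equivalent to minimizing a regularized least-squares cost:
\hat{A}_{1:T}
  = \arg\max_{A_{1:T}} \; p\left(A_{1:T} \mid x_{1:T+1}\right)
  = \arg\min_{A_{1:T}} \;
      \sum_{t=1}^{T} \left\| x_{t+1} - A_t x_t \right\|_2^2
    + \lambda_1 \sum_{t=1}^{T} \left\| A_t \right\|_1
    + \lambda_2 \sum_{t=2}^{T} \left\| A_t - A_{t-1} \right\|_F^2
% \|A_t\|_1 denotes the sum of absolute values of the entries of A_t.
% \lambda_1 controls sparsity; \lambda_2 controls temporal smoothness.
```

Because the L1 term is not differentiable at zero, a quasi-Newton solver such as L-BFGS would typically be applied to a smoothed version of it; whether and how the paper does this is not stated in the abstract. In this sketch, \lambda_1 and \lambda_2 play the role of the regularization parameters that the forward validation measure is meant to select.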
