Network Inference by Combining Biologically Motivated Regulatory Constraints with Penalized Regression

Reconstructing biomolecular networks from time-series mRNA or protein abundance measurements is a central challenge in computational systems biology. The regulatory processes behind cellular responses are coupled and nonlinear, leading to rich dynamical behavior. One class of reconstruction algorithms uses regression and penalized regression to impose sparseness on the solution, a property that biological networks are expected to have. Motivated by the five-gene challenge in the Dialogue for Reverse Engineering Assessments and Methods 2 (DREAM2) contest, we extend and test penalized regression schemes on both simulated data and real qPCR measurements. The best-performing methods are Adaptive Ridge (AR) regression and a new extension of it, in which we impose a biological constraint on the reconstructed network. Specifically, we require that all outgoing links of a given regulator have the same regulatory sign, which is a reasonable approximation for most prokaryotic transcriptional networks. In other words, a given regulator must be either an activator or a repressor, but not both. The constraints can be implemented with quadratic programming, and we show that this improves the reconstruction performance significantly. While linear models are not sufficiently general to encompass most complex behaviors, they offer powerful tools for network reconstruction, particularly for systems operating near a steady state. In particular, the optimization problems are well behaved and global optima can be found efficiently. Adding constraints that reflect biological circuit designs is one of the most important aspects of network inference. We propose one such constraint, namely consistency in the signs of outgoing links, which will facilitate the inference of transcriptional regulatory networks.
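To make the sign-consistency constraint concrete, the following is a minimal sketch and not the authors' implementation. It assumes a linear model Y ≈ X Wᵀ, where W[i, j] is the effect of gene j on gene i, uses a plain ridge penalty in place of the iteratively reweighted Adaptive Ridge penalty, and treats the constrained fit as a bounded least-squares problem (a simple quadratic program) solved with `scipy.optimize.lsq_linear`. The function name `fit_sign_consistent`, the parameter `lam`, the choice to leave self-regulation (the diagonal) unconstrained, and the exhaustive search over the 2^n activator/repressor assignments are all illustrative assumptions; the enumeration is only practical for small networks such as a five-gene system.

```python
import itertools

import numpy as np
from scipy.optimize import lsq_linear


def fit_sign_consistent(X, Y, lam=1e-2):
    """Ridge fit of Y ~ X @ W.T with sign-consistent outgoing links.

    X : (T, n) array of expression states.
    Y : (T, n) array of targets (e.g. next state or time derivative).
    W[i, j] is the effect of gene j on gene i, so the outgoing links
    of regulator j are the off-diagonal entries of column j.
    Returns (W, signs), where signs[j] is +1 (activator) or -1 (repressor).
    """
    T, n = X.shape

    # Unknown vector w is W flattened row-major: w[i * n + j] = W[i, j].
    # Block i of the block-diagonal design predicts Y[:, i] = X @ W[i, :].
    A = np.kron(np.eye(n), X)
    b = Y.T.ravel()

    # Ridge penalty lam * ||w||^2 expressed as extra least-squares rows.
    A_aug = np.vstack([A, np.sqrt(lam) * np.eye(n * n)])
    b_aug = np.concatenate([b, np.zeros(n * n)])

    diag = np.diag_indices(n)
    best = None
    # Hypothetical search strategy: try every activator/repressor labeling.
    for signs in itertools.product([1, -1], repeat=n):
        lb = np.full((n, n), -np.inf)
        ub = np.full((n, n), np.inf)
        for j, s in enumerate(signs):
            if s > 0:
                lb[:, j] = 0.0  # activator: outgoing links >= 0
            else:
                ub[:, j] = 0.0  # repressor: outgoing links <= 0
        # Assumption: self-regulation is left unconstrained.
        lb[diag] = -np.inf
        ub[diag] = np.inf

        res = lsq_linear(A_aug, b_aug, bounds=(lb.ravel(), ub.ravel()))
        if best is None or res.cost < best[0]:
            best = (res.cost, res.x.reshape(n, n), signs)

    _, W, signs = best
    return W, signs
```

Calling `fit_sign_consistent(X, Y)` on a small data set returns the estimated interaction matrix together with the activator/repressor label chosen for each regulator. For each fixed labeling the problem is convex, so the bounded solver reaches its global optimum; the combinatorial part is handled here by brute force only because the number of genes is assumed to be small.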
