Revising regulatory networks: from expression data to linear causal models

Discovering the complex regulatory networks that govern mRNA expression is an important but difficult problem. Many current approaches use only expression data from microarrays to infer the likely network structure. This ignores much existing knowledge: for a given organism and system under study, a biologist may already have a partial model of gene regulation. We propose a method that uses expression data to revise and improve these initial models, which may be incomplete or partially incorrect. We demonstrate our approach by revising a model of photosynthesis regulation proposed by a biologist for Cyanobacteria. Applied to wild-type expression data, our system suggested several modifications consistent with biological knowledge. Applied to data from a mutant strain, our system correctly modified the disabled gene. Power experiments with synthetic data indicate that reliable revision is feasible even with a small number of samples.
