A model-based optimization framework for the inference of regulatory interactions using time-course DNA microarray expression data

Background: Proteins are the primary regulatory agents of transcription, even though mRNA expression data alone, from systems like DNA microarrays, are widely used. In addition, the regulation process in genetic systems is inherently non-linear, and most studies employ a time-course analysis of mRNA expression. These considerations should be taken into account in the development of methods for the inference of regulatory interactions in genetic networks.

Results: We use an S-system based model for the transcription and translation process. We propose an optimization-based regulatory network inference approach that uses time-varying data from DNA microarray analysis. Currently, this seems to be the only model-based method that can be used for the analysis of time-course "relative" expressions (expression ratios). We analyze the dynamic behavior of the system when the number of available experimental samples is varied, when there are different levels of noise in the data, and when there are genes that are not considered by the experimenter. Our studies show that the principal factor affecting the ability of a method to infer interactions correctly is the similarity in the time profiles of some or all of the genes. The less similar the profiles are to each other, the easier it is to infer the interactions. We propose a heuristic method for resolving networks and show that it displays reasonable performance on a synthetic network. Finally, we validate our approach using real experimental data for a chosen subset of genes involved in the sporulation cascade of Bacillus anthracis. We show that the method captures most of the important known interactions between the chosen genes.

Conclusion: The performance of any inference method for regulatory interactions between genes depends on the noise in the data, the existence of unknown genes affecting the network genes, and the similarity in the time profiles of some or all of the genes. Though subject to these issues, the inference method proposed in this paper would be useful because of its ability to infer important interactions, because it can be used with time-course DNA microarray data, and because it is based on a non-linear model of the process that explicitly accounts for the regulatory role of proteins.
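
As a rough illustration of the kind of model the abstract refers to, the sketch below simulates the generic S-system form from biochemical systems theory, dX_i/dt = alpha_i * prod_j X_j^g_ij - beta_i * prod_j X_j^h_ij, for one mRNA and one protein. It is not the authors' formulation or inference method: the two-variable network, all parameter values, the function names, and the use of simple forward-Euler integration are assumptions chosen only to show how a protein can repress its own transcript in a power-law model.

```python
import math

def s_system_rhs(x, alpha, beta, g, h):
    """Generic S-system right-hand side:
    dx_i/dt = alpha_i * prod_j x_j**g[i][j] - beta_i * prod_j x_j**h[i][j]."""
    n = len(x)
    dx = []
    for i in range(n):
        production = alpha[i] * math.prod(x[j] ** g[i][j] for j in range(n))
        degradation = beta[i] * math.prod(x[j] ** h[i][j] for j in range(n))
        dx.append(production - degradation)
    return dx

def simulate(x0, alpha, beta, g, h, dt=0.01, steps=2000):
    """Forward-Euler integration (an assumption; any ODE solver would do)."""
    x = list(x0)
    trajectory = [list(x)]
    for _ in range(steps):
        dx = s_system_rhs(x, alpha, beta, g, h)
        # Clamp to a small positive value so the power-law terms stay defined.
        x = [max(xi + dt * dxi, 1e-9) for xi, dxi in zip(x, dx)]
        trajectory.append(list(x))
    return trajectory

# Hypothetical two-variable system: x[0] = mRNA, x[1] = protein.
alpha = [2.0, 1.0]   # production rate constants (illustrative values)
beta = [1.0, 1.0]    # degradation rate constants (illustrative values)
g = [[0.0, -0.5],    # transcription is repressed by the protein
     [1.0, 0.0]]     # translation is driven by the mRNA
h = [[1.0, 0.0],     # first-order mRNA degradation
     [0.0, 1.0]]     # first-order protein degradation

trajectory = simulate([1.0, 1.0], alpha, beta, g, h)
final_mrna, final_protein = trajectory[-1]
```

With these illustrative parameters the steady state satisfies mRNA = protein and 2 * protein**(-0.5) = mRNA, so both variables settle near 2**(2/3). An inference method of the kind described would work in the opposite direction, estimating the exponents g and h from measured time profiles.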
