Modeling Gene Expression from Microarray Expression Data with State-Space Equations

We describe a new method for modelling gene expression from time-course gene expression data. The model is a state-space description of a linear system. A cell can be regarded as a system whose behaviours (responses) depend completely on its current internal state plus any external inputs, and the gene expression levels in the cell provide information about those behaviours. Previously proposed methods viewed genes as the internal state variables of a cellular system and their expression levels as the values of those variables. This viewpoint suffers from underestimation of the model parameters. Instead, we view genes as observation variables whose expression values depend on the current internal state variables and any external inputs. Factor analysis is used to identify the internal state variables, and the Bayesian Information Criterion (BIC) is used to determine their number. By building dynamic equations for the internal state variables and relating the internal state variables to the observation variables (gene expression profiles), we obtain a state-space description of the gene expression model. With the present method, model parameters can be identified unambiguously from time-course gene expression data. We illustrate the method by applying it to two time-course gene expression datasets.
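
The idea of treating genes as observations of a small number of internal states can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a discrete-time linear model without external inputs, z(t+1) ≈ A z(t) and x(t) ≈ C z(t) + mean, uses a truncated SVD as a simple stand-in for factor analysis, and takes the number of internal states p as given rather than selecting it by BIC. The names `simulate`, `fit_state_space`, `A`, `C`, and `Z` are hypothetical and chosen only for this example.

```python
import numpy as np

def simulate(T=30, n_genes=8, seed=0):
    """Generate noise-free toy data from a 2-state linear system (illustrative only)."""
    rng = np.random.default_rng(seed)
    theta = 0.3
    # Assumed dynamics: a slowly decaying rotation of the two internal states.
    A = 0.95 * np.array([[np.cos(theta), -np.sin(theta)],
                         [np.sin(theta),  np.cos(theta)]])
    C = rng.normal(size=(n_genes, 2))      # assumed state-to-gene loadings
    Z = np.empty((T, 2))
    Z[0] = [1.0, 0.0]
    for t in range(T - 1):
        Z[t + 1] = A @ Z[t]
    return Z @ C.T                         # (T, n_genes) expression matrix

def fit_state_space(X, p):
    """Fit x(t) ~ C z(t) + mean and z(t+1) ~ A z(t) from a (T, n_genes) matrix X.

    p is the assumed number of internal state variables.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    # Truncated SVD stands in for factor analysis in this sketch.
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = U[:, :p] * s[:p]                   # (T, p) estimated internal states
    C = Vt[:p].T                           # (n_genes, p) observation matrix
    # Least-squares fit of Z[1:] ~ Z[:-1] @ A.T over consecutive time points.
    A_T, *_ = np.linalg.lstsq(Z[:-1], Z[1:], rcond=None)
    return A_T.T, C, Z, mean

X = simulate()
A_hat, C_hat, Z_hat, mean_hat = fit_state_space(X, p=2)
X_rec = Z_hat @ C_hat.T + mean_hat         # expression reconstructed from the states
```

Because only p×p entries of A and n_genes×p entries of C are estimated, the number of parameters grows with the number of internal states rather than with the square of the number of genes, which is the practical appeal of the observation-variable viewpoint.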
