Reverse engineering gene networks using singular value decomposition and robust regression

We propose a scheme to reverse-engineer gene networks on a genome-wide scale using a relatively small amount of gene expression data from microarray experiments. Our method is based on the empirical observation that such networks are typically large and sparse. It uses singular value decomposition to construct a family of candidate solutions and then uses robust regression to identify the solution with the smallest number of connections as the most likely solution. Our algorithm has O(log N) sampling complexity and O(N^4) computational complexity. We test and validate our approach in a series of in numero experiments on model gene networks.
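
The two steps described above can be illustrated with a minimal sketch. It is not the authors' implementation and it rests on several assumptions that the abstract does not state: a linear model in which the rate of change of expression is `Xdot = J @ X`, noise-free data, and fewer experiments than genes. The names `sparsest_row` and `infer_network` are hypothetical. For each gene, the singular value decomposition gives one particular solution plus a basis for the null space, which together span the family of candidate solutions. The sparsest candidate is then approximated by minimizing the L1 norm, posed here as a linear program and solved with `scipy.optimize.linprog`; that solver is a choice made for this sketch.

```python
import numpy as np
from scipy.optimize import linprog


def sparsest_row(X, xdot_i, tol=1e-10):
    """Return a minimum-L1 row j satisfying j @ X == xdot_i.

    X      : (N, M) expression levels, N genes by M experiments (assumed layout)
    xdot_i : (M,) rate of change of gene i in each experiment
    """
    # SVD of the (M, N) system A @ j = xdot_i, where A = X.T
    U, s, Vt = np.linalg.svd(X.T, full_matrices=True)
    rank = int(np.sum(s > tol * s[0]))

    # Particular (minimum L2 norm) solution built from the nonzero singular values
    j0 = Vt[:rank].T @ ((U[:, :rank].T @ xdot_i) / s[:rank])

    # Columns spanning the null space: adding any combination keeps A @ j fixed
    V_null = Vt[rank:].T
    n, k = V_null.shape
    if k == 0:
        return j0  # the system is fully determined, so there is no family to search

    # Linear program: minimize sum(t) subject to |j0 + V_null @ y| <= t
    c = np.concatenate([np.zeros(k), np.ones(n)])
    A_ub = np.block([[V_null, -np.eye(n)], [-V_null, -np.eye(n)]])
    b_ub = np.concatenate([-j0, j0])
    bounds = [(None, None)] * k + [(0, None)] * n
    res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
    if not res.success:
        raise RuntimeError(res.message)
    return j0 + V_null @ res.x[:k]


def infer_network(X, Xdot):
    """Estimate the (N, N) connectivity matrix one gene (row) at a time."""
    return np.vstack([sparsest_row(X, Xdot[i]) for i in range(X.shape[0])])
```

Calling `infer_network(X, Xdot)` with an `(N, M)` expression matrix and its matching `(N, M)` matrix of rates returns an `(N, N)` estimate whose rows reproduce the data and have the smallest L1 norm among all candidates. Whether that estimate coincides with the true network depends on how sparse the network is relative to the number of experiments.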
