Genomic signal processing: from matrix algebra to genetic networks.

DNA microarrays make it possible, for the first time, to record the complete genomic signals that guide the progression of cellular processes. Future discovery in biology and medicine will come from the mathematical modeling of these data, which hold the key to fundamental understanding of life on the molecular level, as well as answers to questions regarding diagnosis, treatment, and drug development. This chapter reviews the first data-driven models that were created from these genome-scale data, through adaptations and generalizations of mathematical frameworks from matrix algebra that have proven successful in describing the physical world, in such diverse areas as mechanics and perception: the singular value decomposition model, the generalized singular value decomposition comparative model, and the pseudoinverse projection integrative model. These models provide mathematical descriptions of the genetic networks that generate and sense the measured data, where the mathematical variables and operations represent biological reality. The variables, which are patterns uncovered in the data, correlate with the activities of cellular elements, such as regulators or transcription factors, that drive the measured signals, and with the cellular states in which these elements are active. The operations, such as data reconstruction, rotation, and classification in subspaces of selected patterns, simulate experimental observation of only the cellular programs that these patterns represent. These models are illustrated in the analyses of RNA expression data from yeast and human during their cell cycle programs and DNA-binding data from yeast cell cycle transcription factors and replication initiation proteins.
Two alternative pictures of RNA expression oscillations during the cell cycle that emerge from these analyses, which parallel well-known designs of physical oscillators, convey the capacity of the models to elucidate the design principles of cellular systems, as well as guide the design of synthetic ones. In these analyses, the power of the models to predict previously unknown biological principles is demonstrated with a prediction of a novel mechanism of regulation that correlates DNA replication initiation with cell cycle-regulated RNA transcription in yeast. These models may become the foundation of a future in which biological systems are modeled as physical systems are today.
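As a rough illustration of two of the operations named in this abstract, the sketch below applies a singular value decomposition to a small synthetic genes-by-arrays matrix, reconstructs the data in a subspace of selected patterns, and then projects the data onto a basis with the pseudoinverse. It assumes NumPy; the function names (svd_patterns, reconstruct, pseudoinverse_projection) and the synthetic data are hypothetical choices made for illustration and are not taken from the chapter. The generalized singular value decomposition is not shown.

```python
import numpy as np


def svd_patterns(data):
    """Decompose a genes x arrays matrix with the SVD.

    Returns the left singular vectors (one column per pattern, across genes),
    the singular values, the right singular vectors (one row per pattern,
    across arrays), and the fraction of the overall signal that each
    pattern captures.
    """
    u, s, vt = np.linalg.svd(data, full_matrices=False)
    fractions = s**2 / np.sum(s**2)
    return u, s, vt, fractions


def reconstruct(u, s, vt, keep):
    """Rebuild the data from the selected patterns only."""
    keep = np.asarray(keep)
    # Scale each kept left singular vector by its singular value,
    # then combine with the matching right singular vectors.
    return (u[:, keep] * s[keep]) @ vt[keep, :]


def pseudoinverse_projection(data, basis):
    """Least-squares coefficients of the data in the span of the basis.

    data:  genes x arrays matrix
    basis: genes x basis-profiles matrix
    Returns a basis-profiles x arrays matrix of coefficients.
    """
    return np.linalg.pinv(basis) @ data


# Synthetic example (an assumption for illustration): 50 genes measured at
# 8 time points, built from a steady-state pattern and two oscillating
# patterns that are a quarter period apart.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 2.0 * np.pi, 8, endpoint=False)
patterns = np.vstack([np.ones_like(t), np.cos(t), np.sin(t)])  # 3 x 8
loadings = rng.normal(size=(50, 3))                            # 50 x 3
data = loadings @ patterns                                     # 50 x 8

u, s, vt, fractions = svd_patterns(data)

# Keep only the three significant patterns; the rest are numerical noise here.
rebuilt = reconstruct(u, s, vt, keep=[0, 1, 2])

# Project the data onto the gene loadings to recover the array-side patterns.
coefficients = pseudoinverse_projection(data, loadings)
```

In this toy setting the data have rank three by construction, so keeping the first three patterns reproduces the matrix, and projecting onto the loadings recovers the three time courses. With measured data, choosing which patterns to keep or remove is a modeling decision that this sketch does not address.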
