Polynomial Models for Systems Biology: Data Discretization and Term Order Effect on Dynamics

Systems biology aims at a system-level understanding of biological systems, in particular cellular networks. The milestones of this understanding are knowledge of the structure of the system, understanding of its dynamics, effective control methods, and powerful prediction capability. The complexity of biological systems makes mathematical modeling indispensable for achieving these goals. The enormous accumulation of experimental data representing the activities of the living cell has triggered growing interest in the reverse engineering of biological networks from data. In particular, the construction of discrete models for reverse engineering of biological networks is receiving attention, with the goal of providing a coarse-grained description of such networks. In this dissertation we consider the modeling framework of polynomial dynamical systems over finite fields constructed from experimental data. We present, and propose solutions to, two problems inherent in this modeling method: the necessity of appropriate discretization of the data and the selection of a particular polynomial model from the set of all models that fit the data. Data discretization, also known as binning, is a crucial issue for the construction of discrete models of biological networks. Experimental data, however, are usually continuous or, at least, represented by computer floating-point numbers. A major challenge in discretizing biological data, such as those collected through microarray experiments, is the typically small sample size. Many discretization methods are not applicable because of the insufficient amount of data. The method proposed in this work is a first attempt to develop a discretization tool that takes into consideration the issues and limitations inherent in short time-course data. Our focus is on the two characteristics that any discretization method should possess
