On the Detection of Gene Network Interconnections using Directed Mutual Information

In this paper, we propose and validate a systematic method for inferring biological gene networks. So far, the identification of even a small portion of a gene network has been achieved only by consensus among multiple cellular biology labs. A gene refers to the sequence of DNA that encodes a single protein. Proteins encoded by a gene can regulate other genes in the living cell, forming a complex network that determines cell growth, health, and disease. We view gene networks as discrete-time dynamic systems formed by interconnected genes, which are abstracted as nodes whose states take values in the range [-1, 1]. The state of each node is a function of the past values of the states of other nodes in the network. The edges of the gene network indicate functional dependence among the node states, and the edge directions indicate their causal relationships. New engineering developments, such as quantum dot sensors, will allow measurement of gene dynamics inside living cells. We show how each edge in a gene network can be inferred from gene time-course data using the concept of directed mutual information. We validated our method on small, randomly generated networks and on the known network for flagella biosynthesis in E. coli, which we used to generate noisy gene time-course data in simulations. For acyclic graphs with 7 or fewer genes and summation operations only, we were able to infer all edges perfectly. We also present a heuristic method to deal with Boolean operations.
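
To make the edge-inference idea concrete, here is a minimal sketch in Python; it is not the authors' implementation. It assumes a two-gene network in which X drives Y with a one-step delay and additive noise, quantizes states in [-1, 1] into four bins, and scores each direction with a plug-in estimate of I(src[t-1]; dst[t] | dst[t-1]). This one-step conditional mutual information is a simplified stand-in for directed mutual information, which in general conditions on full histories. All function names, the coupling gain 0.9, the noise level, and the decision threshold are hypothetical choices made for illustration.

```python
import math
import random
from collections import Counter


def quantize(value, levels=4):
    """Map a state in [-1, 1] to an integer bin in 0..levels-1."""
    idx = int((value + 1.0) / 2.0 * levels)
    return min(max(idx, 0), levels - 1)


def simulate_pair(n_steps, seed=0):
    """Simulate X -> Y: y[t] = 0.9 * x[t-1] + noise, clipped to [-1, 1].

    The gain and noise level are assumptions for this sketch.
    """
    rng = random.Random(seed)
    x = [rng.uniform(-1.0, 1.0) for _ in range(n_steps)]
    y = [0.0]
    for t in range(1, n_steps):
        value = 0.9 * x[t - 1] + rng.gauss(0.0, 0.1)
        y.append(min(max(value, -1.0), 1.0))
    return [quantize(v) for v in x], [quantize(v) for v in y]


def lagged_directed_score(src, dst):
    """Plug-in estimate of I(src[t-1]; dst[t] | dst[t-1]) in bits."""
    triples = Counter()      # counts of (dst_prev, src_prev, dst_now)
    prev_pairs = Counter()   # counts of (dst_prev, src_prev)
    dst_pairs = Counter()    # counts of (dst_prev, dst_now)
    dst_prev = Counter()     # counts of dst_prev
    n = len(src) - 1         # number of observed transitions
    for t in range(1, len(src)):
        a, b, c = dst[t - 1], src[t - 1], dst[t]
        triples[(a, b, c)] += 1
        prev_pairs[(a, b)] += 1
        dst_pairs[(a, c)] += 1
        dst_prev[a] += 1
    score = 0.0
    for (a, b, c), k in triples.items():
        # p(a,b,c) * log2[ p(a,b,c) p(a) / (p(a,b) p(a,c)) ]
        ratio = k * dst_prev[a] / (prev_pairs[(a, b)] * dst_pairs[(a, c)])
        score += (k / n) * math.log2(ratio)
    return score


x_series, y_series = simulate_pair(5000)
score_xy = lagged_directed_score(x_series, y_series)  # candidate edge X -> Y
score_yx = lagged_directed_score(y_series, x_series)  # candidate edge Y -> X

THRESHOLD = 0.1  # hypothetical cutoff in bits, chosen only for this example
candidates = [("X -> Y", score_xy), ("Y -> X", score_yx)]
inferred_edges = [name for name, s in candidates if s > THRESHOLD]
print(inferred_edges)
```

In this toy setting the X -> Y score is large, while the Y -> X score stays near zero apart from a small finite-sample bias, so thresholding recovers the single directed edge. Networks with more genes, longer memory, or Boolean interactions would need conditioning on more variables and considerably more data than this sketch uses.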
