Reconstruction of gene regulatory networks from postgenomic data

An important problem in systems biology is the inference of biochemical pathways and regulatory networks from postgenomic data. The recent substantial increase in the availability of such data has stimulated interest in inferring these networks and pathways directly from the data. This thesis focuses on the application, evaluation and improvement of machine learning methods for the reverse engineering of biochemical pathways and networks. It begins by applying an established method to newly available gene expression data related to the interferon pathway of the human immune system, in order to identify the subpathways that are active under different experimental conditions. It then presents a comparative evaluation of several machine learning methods (relevance networks, graphical Gaussian models and Bayesian networks) using observational and interventional data from cytometry experiments as well as data simulated from a gold-standard network. Finally, the thesis extends existing methods to include biological prior knowledge within the Bayesian approach, with the aim of increasing the accuracy of the predicted networks, and it quantifies the extent to which reconstruction accuracy can be improved in this way.
