Discovering Gene–Gene and Gene–Environment Causal Interactions Using Bioinformatics Approaches

The exponential growth of molecular biology data has led to an intense focus on the study of interactions among DNA, RNA, protein biosynthesis, and the environment. With advanced genomics tools, large datasets generated by gene expression profiling experiments and next-generation sequencing technologies enable us to link molecular states and environmental effects to physiological states by reverse engineering the gene–gene and gene–environment interaction networks that sense DNA and environmental perturbations. This should ultimately help us understand the variations in physiological states that are associated with disease. In this chapter we review mathematical and statistical bioinformatics approaches for discovering and modeling gene–gene and gene–environment causal interactions. We also present new modeling methods for probabilistic networks that incorporate various interventions that perturb the system.


[47]  K. Chaloner,et al.  Bayesian Experimental Design: A Review , 1995 .

[48]  Nir Friedman,et al.  Inferring subnetworks from perturbed expression profiles , 2001, ISMB.

[49]  Rachel B. Brem,et al.  Integrating large-scale functional genomic data to dissect the complexity of yeast regulatory networks , 2008, Nature Genetics.

[50]  Atul J. Butte,et al.  Extracting Knowledge from Dynamics in Gene Expression , 2001, J. Biomed. Informatics.

[51]  Ka Yee Yeung,et al.  Algorithms for choosing differential gene expression experiments , 1999, RECOMB.

[52]  Barend Mons,et al.  Assignment of protein function and discovery of novel nucleolar proteins based on automatic analysis of MEDLINE , 2007, Proteomics.

[53]  J. Pearl Causality: Models, Reasoning and Inference , 2000 .

[54]  Philippa J Talmud,et al.  Gene-environment interaction and its impact on coronary heart disease risk. , 2007, Nutrition, metabolism, and cardiovascular diseases : NMCD.

[55]  Ash A. Alizadeh,et al.  Distinct types of diffuse large B-cell lymphoma identified by gene expression profiling , 2000, Nature.

[56]  P. Spirtes,et al.  Causation, prediction, and search , 1993 .

[57]  William D. Dupont,et al.  Power and sample size calculations. A review and computer program. , 1990, Controlled clinical trials.

[58]  Steven Skiena,et al.  Identifying gene regulatory networks from experimental data , 2001, Parallel Comput..

[59]  D. Heckerman,et al.  A Bayesian Approach to Causal Discovery , 2006 .

[60]  David Heckerman,et al.  A Bayesian Approach to Learning Causal Networks , 1995, UAI.

[61]  H Matsuno,et al.  Hybrid Petri net representation of gene regulatory network. , 1999, Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing.

[62]  Thomas Mestl,et al.  A methodological basis for description and analysis of systems with complex switch-like interactions , 1998, Journal of mathematical biology.

[63]  Jean Peccoud,et al.  Analysis of the Stabilizing Effect of ROM on the Genetic Network Controlling ColE1 Plasmid Replication , 1999, Pacific Symposium on Biocomputing.

[64]  Jason H. Moore,et al.  Multifactor dimensionality reduction software for detecting gene-gene and gene-environment interactions , 2003, Bioinform..

[65]  G. Getz,et al.  Coupled two-way clustering analysis of gene microarray data. , 2000, Proceedings of the National Academy of Sciences of the United States of America.

[66]  Ting Chen,et al.  Modeling Gene Expression with Differential Equations , 1998, Pacific Symposium on Biocomputing.

[67]  Peter J. F. Lucas,et al.  A probabilistic and decision-theoretic approach to the management of infectious disease at the ICU , 2000, Artif. Intell. Medicine.

[68]  J. Ross,et al.  A Test Case of Correlation Metric Construction of a Reaction Pathway from Measurements , 1997 .

[69]  E. Lakatos,et al.  Sample sizes based on the log-rank statistic in complex clinical trials. , 1988, Biometrics.

[70]  Kevin Murphy,et al.  Modelling Gene Expression Data using Dynamic Bayesian Networks , 2006 .

[71]  C. Ulrich,et al.  Colorectal adenomas and the C677T MTHFR polymorphism: evidence for gene-environment interaction? , 1999, Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology.

[72]  L. Kruglyak,et al.  Gene–Environment Interaction in Yeast Gene Expression , 2008, PLoS biology.

[73]  John R. Koza,et al.  Reverse Engineering of Metabolic Pathways from Observed Data Using Genetic Programming , 2000, Pacific Symposium on Biocomputing.

[74]  S Fuhrman,et al.  Reveal, a general reverse engineering algorithm for inference of genetic network architectures. , 1998, Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing.

[75]  Duccio Cavalieri,et al.  Genome-wide scan reveals that genetic variation for transcriptional plasticity in yeast is biased towards multi-copy and dispensable genes. , 2006, Gene.

[76]  Patrik D'haeseleer,et al.  Linear Modeling of mRNA Expression Levels During CNS Development and Injury , 1998, Pacific Symposium on Biocomputing.

[77]  B. Goodwin Oscillatory behavior in enzymatic control processes. , 1965, Advances in enzyme regulation.

[78]  R. Fisher Statistical methods for research workers , 1927, Protoplasma.

[79]  C. Müller,et al.  Large-scale clustering of cDNA-fingerprinting data. , 1999, Genome research.

[80]  Suzanne M. Paley,et al.  Integrated pathway/genome databases and their role in drug discovery , 1999 .

[81]  Gregory F. Cooper,et al.  INKBLOT: A neurological diagnostic decision support system integrating causal and anatomical knowledge , 1997, Artif. Intell. Medicine.

[82]  T. Ideker,et al.  A new approach to decoding life: systems biology. , 2001, Annual review of genomics and human genetics.

[83]  J. Mesirov,et al.  Molecular classification of cancer: class discovery and class prediction by gene expression monitoring. , 1999, Science.

[84]  J. Griffith Mathematics of cellular control processes. II. Positive feedback to one gene. , 1968, Journal of theoretical biology.

[85]  Marcel J. T. Reinders,et al.  A Comparison of Genetic Network Models , 2000, Pacific Symposium on Biocomputing.

[86]  Daphne Koller,et al.  Active Learning for Structure in Bayesian Networks , 2001, IJCAI.

[87]  Gregory F. Cooper,et al.  A Bayesian method for the induction of probabilistic networks from data , 1992, Machine Learning.

[88]  V. Thorsson,et al.  Discovery of regulatory interactions through perturbation: inference and experimental design. , 1999, Pacific Symposium on Biocomputing. Pacific Symposium on Biocomputing.

[89]  David Maxwell Chickering,et al.  Learning Bayesian Networks: The Combination of Knowledge and Statistical Data , 1994, Machine Learning.

[90]  A. Valencia,et al.  Mining functional information associated with expression arrays , 2001, Functional & Integrative Genomics.

[91]  Satoru Miyano,et al.  Identification of Genetic Networks from a Small Number of Gene Expression Patterns Under the Boolean Network Model , 1998, Pacific Symposium on Biocomputing.

[92]  Tommi S. Jaakkola,et al.  Using Graphical Models and Genomic Expression Data to Statistically Validate Models of Genetic Regulatory Networks , 2000, Pacific Symposium on Biocomputing.

[93]  Javed Mostafa,et al.  Detecting Gene Relations from MEDLINE Abstracts , 2000, Pacific Symposium on Biocomputing.

[94]  G. Churchill,et al.  Experimental design for gene expression microarrays. , 2001, Biostatistics.

[95]  Catherine Garbay,et al.  A Society of Goal-Oriented Agents for the Analysis of Living Cells , 1997, AIME.