A Bayesian Framework for Inference of the Genotype–Phenotype Map for Segregating Populations

Complex genetic interactions lie at the foundation of many diseases. Understanding the nature of these interactions is critical to developing rational intervention strategies. In mammalian systems, hypothesis testing in vivo is expensive, time-consuming, and often restricted to a few physiological endpoints. Computational methods that generate causal hypotheses can therefore help to prioritize targets for experimental intervention. We propose a Bayesian statistical method to infer networks of causal relationships among genotypes and phenotypes using expression quantitative trait loci (eQTL) data from genetically randomized populations. Causal relationships between network variables are described with hierarchical regression models. Prior distributions on the network structure enforce graph sparsity and can encode prior biological knowledge about the network. An efficient Monte Carlo method is used to search across the model space and sample highly probable networks. The result is an ensemble of networks that provides a measure of confidence in the estimated network topology. These networks can be used to predict the system-wide response to perturbations. We applied our method to kidney gene expression data from an MRL/MpJ × SM/J intercross population and predicted a previously uncharacterized feedback loop in the local renin–angiotensin system.
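
The sampling idea summarized above can be illustrated with a minimal sketch. It is not the authors' implementation: a BIC-style Gaussian regression score stands in for the hierarchical regression models, a penalty per edge stands in for the sparsity prior, and a single-edge add/delete Metropolis move stands in for the paper's Monte Carlo scheme. All names (`node_score`, `has_cycle`, `log_posterior`, `sample_networks`, `sparsity`, `roots`) and the simulated data are hypothetical. Genotype columns are treated as root nodes, on the assumption that phenotypes cannot cause genotypes.

```python
import numpy as np

def node_score(X, child, parents):
    """BIC-style Gaussian log score of one node regressed on its parents (assumed score)."""
    n = X.shape[0]
    y = X[:, child]
    A = np.column_stack([np.ones(n)] + [X[:, p] for p in parents])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ beta
    sigma2 = max(resid @ resid / n, 1e-12)
    loglik = -0.5 * n * (np.log(2 * np.pi * sigma2) + 1)
    return loglik - 0.5 * A.shape[1] * np.log(n)

def has_cycle(G):
    """True if the directed graph G (G[i, j] means i -> j) contains a cycle."""
    p = G.shape[0]
    indeg = G.sum(axis=0)
    stack = [i for i in range(p) if indeg[i] == 0]
    removed = 0
    while stack:  # repeatedly peel off nodes with no incoming edges
        i = stack.pop()
        removed += 1
        for j in np.flatnonzero(G[i]):
            indeg[j] -= 1
            if indeg[j] == 0:
                stack.append(j)
    return removed < p

def log_posterior(X, G, sparsity):
    """Unnormalized log posterior: data fit minus a per-edge sparsity penalty."""
    fit = sum(node_score(X, j, np.flatnonzero(G[:, j])) for j in range(G.shape[0]))
    return fit - sparsity * G.sum()

def sample_networks(X, n_iter=2000, burn_in=500, sparsity=2.0, roots=(), seed=0):
    """Metropolis sampler over DAGs; returns the posterior frequency of each edge."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    G = np.zeros((p, p), dtype=bool)
    current = log_posterior(X, G, sparsity)
    edge_counts = np.zeros((p, p))
    kept = 0
    for it in range(n_iter):
        # Symmetric proposal: toggle one uniformly chosen ordered pair i -> j.
        i, j = rng.choice(p, size=2, replace=False)
        proposal = G.copy()
        proposal[i, j] = not proposal[i, j]
        # Invalid proposals (edge into a root node, or a cycle) are rejected.
        if j not in roots and not has_cycle(proposal):
            candidate = log_posterior(X, proposal, sparsity)
            if np.log(rng.random()) < candidate - current:
                G, current = proposal, candidate
        if it >= burn_in:
            edge_counts += G
            kept += 1
    return edge_counts / kept

# Toy data (simulated, not from the study): genotype q -> transcript x1 -> transcript x2.
rng = np.random.default_rng(1)
q = rng.integers(0, 3, size=300).astype(float)
x1 = q + 0.5 * rng.normal(size=300)
x2 = x1 + 0.5 * rng.normal(size=300)
freq = sample_networks(np.column_stack([q, x1, x2]), roots=(0,))
print(np.round(freq, 2))
```

Entry `freq[i, j]` is the fraction of retained samples that contain the edge i -> j, which plays the role of the ensemble-based confidence measure described in the abstract. Raising `sparsity` favors networks with fewer edges.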
