Mapping eQTL Networks with Mixed Graphical Markov Models

Expression quantitative trait loci (eQTL) mapping is a challenging problem because of, among other reasons, the high-dimensional multivariate nature of gene-expression traits. In addition to the expression heterogeneity produced by confounding factors and other sources of unwanted variation, indirect effects spread across genes as a result of genetic, molecular, and environmental perturbations. From a multivariate perspective, one would like to adjust for the effects of all these factors and arrive at a network of direct associations that traces the path from genotype to phenotype. In this article we approach this challenge with mixed graphical Markov models, higher-order conditional independences, and q-order correlation graphs. These models show that additive genetic effects propagate through the network as a function of gene–gene correlations. Our estimate of the eQTL network underlying a well-studied yeast data set yields a sparse structure with more direct genetic and regulatory associations that enable a straightforward comparison of the genetic control of gene expression across chromosomes. Interestingly, it also reveals that eQTLs explain most of the expression variability of network hub genes.
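The idea that an additive genetic effect propagates along gene–gene correlations, and that conditioning can separate direct from indirect associations, can be illustrated with a toy simulation. The sketch below is not the authors' method or software. It assumes a hypothetical three-variable chain (one biallelic marker `g`, a directly regulated gene `gene_a`, and a downstream gene `gene_b`) with illustrative effect sizes `a` and `b`, and it uses an ordinary first-order partial correlation as a stand-in for a conditional independence test.

```python
import numpy as np

def partial_corr(x, y, z):
    """Correlation of x and y after regressing both on z (with an intercept)."""
    Z = np.column_stack([np.ones(len(x)), z])
    rx = x - Z @ np.linalg.lstsq(Z, x, rcond=None)[0]
    ry = y - Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    return float(np.corrcoef(rx, ry)[0, 1])

def simulate(n=5000, a=1.0, b=0.8, seed=0):
    """Hypothetical chain g -> gene_a -> gene_b; a and b are assumed effect sizes."""
    rng = np.random.default_rng(seed)
    g = rng.integers(0, 2, size=n).astype(float)   # two-allele marker coded 0/1
    gene_a = a * g + rng.normal(size=n)            # direct (additive) eQTL effect
    gene_b = b * gene_a + rng.normal(size=n)       # no direct link from g
    return g, gene_a, gene_b

g, gene_a, gene_b = simulate()

# Marginal additive effect of the marker on the downstream gene:
# difference in mean expression between the two alleles, close to a * b.
effect_on_b = float(gene_b[g == 1].mean() - gene_b[g == 0].mean())

# Marginal association is clearly nonzero even though g does not act on gene_b directly.
marginal = float(np.corrcoef(g, gene_b)[0, 1])

# Conditioning on the intermediate gene removes the indirect association.
conditional = partial_corr(g, gene_b, gene_a)

print(f"marginal effect of g on gene_b: {effect_on_b:.2f}")
print(f"marginal correlation:           {marginal:.2f}")
print(f"partial correlation given A:    {conditional:.2f}")
```

In this toy setting the marker appears associated with `gene_b` in a marginal analysis, with an effect scaled by the strength of the `gene_a`–`gene_b` relationship, while the association vanishes once `gene_a` is conditioned on. Real eQTL networks involve many genes and markers, which is where higher-order conditioning becomes relevant.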
