Bayesian Detection of Expression Quantitative Trait Loci Hot Spots

High-throughput genomics allows genome-wide quantification of gene expression levels in tissues and cell types and, when combined with sequence variation data, permits the identification of genetic control points of expression (expression QTL, or eQTL). Clusters of eQTL influenced by a single genetic polymorphism can point to hotspots that regulate pathways and networks, although very few hotspots have been robustly detected, replicated, or experimentally verified. Here we present a novel modeling strategy to estimate the propensity of a genetic marker to influence several expression traits at the same time, based on a hierarchical formulation of related regressions. We implement this hierarchical regression model in a Bayesian framework using a stochastic search algorithm, HESS, that efficiently probes sparse subsets of genetic markers in a high-dimensional data matrix to identify hotspots and to pinpoint the individual genetic effects (eQTL). Simulating complex regulatory scenarios, we demonstrate that our method outperforms current state-of-the-art approaches, in particular when the number of transcripts is large. We also illustrate the applicability of HESS to diverse real data sets, in mouse and human genetic settings, and show that it provides new insights into regulatory hotspots that were not detected by conventional methods. The results suggest that the combination of our modeling strategy and algorithmic implementation provides significant advantages for the identification of functional eQTL hotspots, revealing key regulators underlying pathways.
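
To make the notion of a hotspot concrete, the following minimal sketch simulates genotypes in which one marker influences several transcripts, then counts for each marker how many transcripts it is marginally associated with. This is not HESS and not the hierarchical Bayesian model described above: it is a naive correlation screen, and the function names (`simulate`, `corr`, `hotspot_counts`), the effect size, and the threshold are all hypothetical choices made for illustration only.

```python
import math
import random


def simulate(n=100, p=20, q=30, hotspot=3, n_targets=10, effect=1.5, seed=0):
    """Simulate n samples, p markers, q transcripts (all values are assumptions).

    Marker `hotspot` shifts the first `n_targets` transcripts; every other
    marker-transcript pair is pure noise.
    """
    rng = random.Random(seed)
    # Genotypes coded as 0, 1 or 2 copies of an allele.
    X = [[rng.choice([0, 1, 2]) for _ in range(p)] for _ in range(n)]
    Y = [
        [
            rng.gauss(0.0, 1.0) + (effect * X[i][hotspot] if k < n_targets else 0.0)
            for k in range(q)
        ]
        for i in range(n)
    ]
    return X, Y


def corr(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    s_ab = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    s_aa = sum((x - mean_a) ** 2 for x in a)
    s_bb = sum((y - mean_b) ** 2 for y in b)
    if s_aa == 0 or s_bb == 0:
        return 0.0
    return s_ab / math.sqrt(s_aa * s_bb)


def hotspot_counts(X, Y, threshold=0.45):
    """For each marker, count transcripts whose |correlation| exceeds threshold."""
    n, p, q = len(X), len(X[0]), len(Y[0])
    marker_cols = [[X[i][j] for i in range(n)] for j in range(p)]
    transcript_cols = [[Y[i][k] for i in range(n)] for k in range(q)]
    return [
        sum(1 for k in range(q) if abs(corr(marker_cols[j], transcript_cols[k])) > threshold)
        for j in range(p)
    ]


if __name__ == "__main__":
    X, Y = simulate()
    counts = hotspot_counts(X, Y)
    # The marker with the largest count is the candidate hotspot.
    print(counts.index(max(counts)), counts)
```

A marginal screen of this kind treats every marker-transcript pair separately and ignores the sparsity and the sharing of information across transcripts that the hierarchical formulation is designed to exploit, so it serves only to fix ideas about what is being detected.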
