Effectively identifying regulatory hotspots while capturing expression heterogeneity in gene expression studies

Expression quantitative trait loci (eQTL) mapping systematically identifies genetic variation that affects gene expression. eQTL mapping studies have shown that certain genomic locations, referred to as regulatory hotspots, may affect the expression levels of many genes. Recent studies have shown that various confounding factors may induce spurious regulatory hotspots. Here, we introduce a novel statistical method that effectively eliminates spurious hotspots while retaining genuine hotspots. Applying our method to simulated and real datasets, we show that it achieves greater sensitivity than previous methods while retaining low false discovery rates.
