Improved methods for multi-trait fine mapping of pleiotropic risk loci

Motivation: Genome-wide association studies (GWAS) have identified thousands of genomic regions containing genetic variants that increase risk for complex traits and diseases. However, the variants uncovered in GWAS are typically not biologically causal; rather, they are correlated with the true causal variant through linkage disequilibrium (LD). To discern the true causal variant(s), a variety of statistical fine-mapping methods have been proposed to prioritize variants for functional validation.

Results: In this work we introduce a new approach, fastPAINTOR, that leverages evidence across correlated traits, as well as functional annotation data, to improve fine-mapping accuracy at pleiotropic risk loci. To improve computational efficiency, we describe a new importance sampling scheme to perform model inference. First, we demonstrate in simulations that by leveraging functional annotation data, fastPAINTOR increases fine-mapping resolution relative to existing methods. Next, we show that jointly modeling pleiotropic risk regions improves fine-mapping resolution compared to standard single-trait and pleiotropic fine-mapping strategies. We report a reduction in the number of SNPs required for follow-up in order to capture 90% of the causal variants, from 23 SNPs per locus using a single trait to 12 SNPs per locus when fine-mapping two traits simultaneously. Finally, we analyze summary association data from a large-scale GWAS of lipids and show that these improvements are largely sustained in real data.

Availability and Implementation: The fastPAINTOR framework is implemented in the PAINTOR v3.0 package, which is publicly available to the research community at http://bogdan.bioinformatics.ucla.edu/software/paintor

Contact: gkichaev@ucla.edu

Supplementary information: Supplementary data are available at Bioinformatics online.
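The "SNPs per locus required to capture 90% of the causal variants" figure can be illustrated with a minimal sketch. It assumes that the metric is computed in simulations by pooling SNPs across loci, ranking them by posterior probability of causality, and counting how many top-ranked SNPs must be selected before 90% of the known causal variants are included. The function name, argument layout, and toy numbers below are hypothetical and are not taken from the PAINTOR package.

```python
import math


def snps_per_locus_to_capture(posteriors, is_causal, n_loci, target=0.90):
    """Average number of top-ranked SNPs per locus needed to capture
    a `target` fraction of the known causal variants.

    posteriors: posterior causal probabilities, pooled over all loci (assumed input)
    is_causal:  0/1 flags marking the simulated causal variants, same order
    n_loci:     number of loci the pooled SNPs come from
    """
    total_causal = sum(is_causal)
    if total_causal == 0:
        raise ValueError("no causal variants supplied")

    # Smallest whole number of causal variants that meets the target fraction.
    needed = math.ceil(target * total_causal)

    # Rank SNP indices from highest to lowest posterior probability.
    order = sorted(range(len(posteriors)), key=lambda i: posteriors[i], reverse=True)

    captured = 0
    for rank, idx in enumerate(order, start=1):
        captured += is_causal[idx]
        if captured >= needed:
            return rank / n_loci

    raise RuntimeError("target not reached; check the inputs")


# Toy data: two loci with three SNPs each and one causal variant per locus.
toy_posteriors = [0.60, 0.10, 0.30, 0.05, 0.80, 0.15]
toy_is_causal = [1, 0, 0, 0, 0, 1]
print(snps_per_locus_to_capture(toy_posteriors, toy_is_causal, n_loci=2))
```

Under this reading, a lower value means fewer variants need to be carried forward to functional validation, which is the sense in which the abstract compares 23 and 12 SNPs per locus.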
