Fine-mapping across diverse ancestries drives the discovery of putative causal variants underlying human complex traits and diseases

Genome-wide association studies (GWAS) of human complex traits or diseases often implicate genetic loci that span hundreds or thousands of genetic variants, many of which have similar statistical significance. While statistical fine-mapping in individuals of European descent has made important discoveries, cross-population fine-mapping has the potential to improve power and resolution by capitalizing on the genomic diversity across ancestries. Here we present SuSiEx, an accurate and computationally efficient method for cross-population fine-mapping, which builds on the single-population fine-mapping framework, Sum of Single Effects (SuSiE). SuSiEx integrates data from an arbitrary number of ancestries, explicitly models population-specific allele frequencies and linkage disequilibrium (LD) patterns, accounts for multiple causal variants in a genomic region, and can be applied to GWAS summary statistics when individual-level data are unavailable. We comprehensively evaluated SuSiEx using simulations, a range of quantitative traits measured in both UK Biobank and Taiwan Biobank, and schizophrenia GWAS across East Asian and European ancestries. In all evaluations, SuSiEx fine-mapped more association signals, produced smaller credible sets and higher posterior inclusion probabilities (PIPs) for putative causal variants, and retained population-specific causal variants.
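To make the intuition behind PIPs and credible sets concrete, the following is a minimal sketch of why combining ancestries can sharpen resolution. It is not the SuSiEx algorithm: it assumes a single causal variant shared across populations, uses Wakefield's approximate Bayes factor computed from marginal effect estimates, and ignores LD matrices and multiple causal signals. The function names, the prior effect-size variance of 0.04, and the toy summary statistics are all illustrative assumptions.

```python
import math

def log_abf(beta, se, prior_var=0.04):
    """Log of Wakefield's approximate Bayes factor (alternative vs. null)."""
    v = se ** 2
    shrink = prior_var / (v + prior_var)
    z2 = (beta / se) ** 2
    return 0.5 * math.log(1.0 - shrink) + 0.5 * z2 * shrink

def cross_population_pips(stats_by_pop, prior_var=0.04):
    """PIPs under a single shared causal variant.

    stats_by_pop: one list per population, each holding (beta, se)
    per variant, with variants in the same order in every population.
    """
    n_variants = len(stats_by_pop[0])
    log_bf = [0.0] * n_variants
    for pop in stats_by_pop:
        for j, (beta, se) in enumerate(pop):
            # Independent cohorts: evidence multiplies, so log BFs add.
            log_bf[j] += log_abf(beta, se, prior_var)
    top = max(log_bf)  # subtract the maximum for numerical stability
    weights = [math.exp(x - top) for x in log_bf]
    total = sum(weights)
    return [w / total for w in weights]

def credible_set(pips, coverage=0.95):
    """Smallest set of variant indices whose PIPs sum to at least `coverage`."""
    order = sorted(range(len(pips)), key=lambda j: pips[j], reverse=True)
    chosen, cumulative = [], 0.0
    for j in order:
        chosen.append(j)
        cumulative += pips[j]
        if cumulative >= coverage:
            break
    return chosen

# Toy data (hypothetical): variants 0 and 1 are in high LD in the first
# population, so both look associated; the LD is weaker in the second.
eur = [(0.10, 0.02), (0.098, 0.02), (0.01, 0.02)]
eas = [(0.10, 0.03), (0.02, 0.03), (0.00, 0.03)]

pips_eur = cross_population_pips([eur])
pips_cross = cross_population_pips([eur, eas])
print("single-population credible set:", credible_set(pips_eur))
print("cross-population credible set: ", credible_set(pips_cross))
```

In this toy example the single-population analysis cannot separate the two correlated variants, whereas adding a population with a different LD pattern concentrates the posterior on one variant and shrinks the credible set. A full method such as the one described above additionally has to handle multiple causal variants and population-specific LD, which this sketch deliberately omits.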
