Joint testing of rare variant burden scores using non-negative least squares

Gene-based burden tests are a popular and powerful approach for the analysis of exome-wide association studies. These approaches combine sets of variants within a gene into a single burden score that is then tested for association. Typically, several burden scores are calculated and tested per gene, spanning a range of annotation classes and frequency bins. Correlation between these tests can complicate multiple testing correction and hamper interpretation of the results. We introduce a new method, the Sparse Burden Association Test (SBAT), that tests the joint set of burden scores under the assumption that causal burden scores act in the same effect direction. The method simultaneously assesses the significance of the model fit and selects the set of burden scores that best explains the association. Using simulated data, we show that the method is well calibrated and highlight scenarios where the test outperforms existing gene-based tests. We apply the method to 73 quantitative traits from the UK Biobank, which further illustrates its power. The test is implemented in the REGENIE software.
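The idea of jointly testing correlated burden scores under a shared effect direction can be illustrated with a small sketch. This is not the REGENIE implementation, and it does not reproduce the method's own significance calculation: a permutation null is swapped in here purely for illustration. The function names (`simulate_nested_burdens`, `nnls_stat`, `joint_burden_test`), the toy carrier frequencies, and the rule of taking the better of the two sign-constrained fits are all assumptions made for this example.

```python
import numpy as np
from scipy.optimize import nnls


def simulate_nested_burdens(n=2000, effect=1.0, seed=1):
    """Toy data (hypothetical): three nested carrier masks, only the first causal."""
    rng = np.random.default_rng(seed)
    m1 = rng.random(n) < 0.05                # strictest mask
    m2 = m1 | (rng.random(n) < 0.05)         # superset of m1
    m3 = m2 | (rng.random(n) < 0.10)         # superset of m2
    X = np.column_stack([m1, m2, m3]).astype(float)
    y = effect * X[:, 0] + rng.normal(size=n)
    return X, y


def nnls_stat(X, y):
    """Non-negative least squares fit; statistic = reduction in residual sum of squares."""
    coef, rnorm = nnls(X, y)
    return coef, float(y @ y - rnorm ** 2)


def joint_burden_test(X, y, n_perm=200, seed=0):
    """Joint test of burden scores, assuming all causal scores share one effect direction.

    Assumption: significance comes from permuting the phenotype, which is a
    stand-in for an analytic null distribution.
    """
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)                  # centring absorbs the intercept
    yc = y - y.mean()

    def best_direction(target):
        # Fit once with all effects >= 0 and once with all effects <= 0.
        fits = {s: nnls_stat(s * Xc, target) for s in (1, -1)}
        s = max(fits, key=lambda k: fits[k][1])
        return s, fits[s][0], fits[s][1]

    direction, coef, observed = best_direction(yc)
    null = np.array(
        [best_direction(rng.permutation(yc))[2] for _ in range(n_perm)]
    )
    p_value = (1 + np.sum(null >= observed)) / (n_perm + 1)
    return {
        "p_value": float(p_value),
        "direction": direction,                           # +1 or -1
        "selected": np.flatnonzero(coef > 0).tolist(),    # scores kept in the fit
        "coef": (direction * coef).tolist(),              # signed effect estimates
    }


X, y = simulate_nested_burdens()
result = joint_burden_test(X, y)
print(result)
```

Because the non-negativity constraint sets some coefficients exactly to zero, the same fit that yields the test statistic also indicates which burden scores are retained, which is the sense in which testing and selection happen together.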
