Weighted functional linear regression models for gene-based association analysis

Functional linear regression models are used effectively in gene-based association analysis of complex traits. These models combine information about individual genetic variants, taking their positions into account and reducing the influence of noise and observation errors. When a method combines several components that differ in informativeness, its power can be increased by introducing weights that favor the more informative components. Allele-specific weights have already been introduced into collapsing and kernel-based approaches to gene-based association analysis. Here we introduce, for the first time, weights into functional linear regression models adapted for both independent and family samples. Using data simulated on the basis of GAW17 genotypes, with weights defined by allele frequencies via the beta distribution, we demonstrated that type I error rates correspond to the declared values and that increasing the weights of causal variants increases the power of functional linear models. We applied the new method to real blood pressure data from the ORCADES sample. Five of the six known genes with P < 0.1 in at least one analysis had lower P values under weighted models. Moreover, the weighted functional model detected an association between diastolic blood pressure and the VMP1 gene (P = 8.18 × 10⁻⁶). For this gene, the unweighted functional and weighted kernel-based models gave P = 0.004 and P = 0.006, respectively. The new method is implemented in the program package FREGAT, which is freely available at https://cran.r-project.org/web/packages/FREGAT/index.html.
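As a minimal illustration of weights defined by allele frequencies via the beta distribution, the Python sketch below evaluates a beta density at each variant's minor allele frequency (MAF) and scales the genotype columns by the resulting weights. The shape parameters a = 1 and b = 25, the function names, and the column-scaling step are assumptions made for this example. The abstract does not specify them, and this is not the FREGAT implementation.

```python
import math


def beta_weight(maf, a=1.0, b=25.0):
    """Beta(a, b) density evaluated at a minor allele frequency.

    The defaults a=1, b=25 are an assumed choice that up-weights rare
    variants; a=b=1 gives a weight of 1 for every variant (unweighted).
    """
    if not 0.0 < maf < 1.0:
        raise ValueError("maf must lie strictly between 0 and 1")
    # Log of the normalising constant 1 / B(a, b), computed stably.
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    log_density = (log_norm
                   + (a - 1.0) * math.log(maf)
                   + (b - 1.0) * math.log(1.0 - maf))
    return math.exp(log_density)


def weight_genotypes(genotypes, mafs, a=1.0, b=25.0):
    """Scale each variant column of a genotype matrix by its weight.

    genotypes: list of rows (one per individual), each a list of
               allele counts (0, 1 or 2), one entry per variant.
    mafs:      minor allele frequency of each variant.
    """
    weights = [beta_weight(m, a, b) for m in mafs]
    return [[g * w for g, w in zip(row, weights)] for row in genotypes]


if __name__ == "__main__":
    mafs = [0.01, 0.05, 0.30]            # hypothetical variants
    genotypes = [[1, 0, 2], [0, 1, 1]]   # hypothetical individuals
    print([round(beta_weight(m), 4) for m in mafs])
    print(weight_genotypes(genotypes, mafs))
```

With these assumed parameters the rarest variant receives by far the largest weight, so it dominates the weighted genotype data passed to any downstream regional test.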
