A two-phase procedure for non-normal quantitative trait genetic association study

Background: The nonparametric trend test (NPT) is well suited to identifying genetic variants associated with quantitative traits when the trait values do not satisfy the normal distribution assumption. If the genetic model, defined according to the mode of inheritance, is known, the NPT derived under that model is optimal. In practice, however, the genetic model is often unknown beforehand, and an NPT derived under an incorrect model can lose power. When the underlying genetic model is unknown, a robust test is preferred to maintain satisfactory power.

Results: We propose a two-phase procedure to handle uncertainty about the genetic model in genetic association studies of non-normal quantitative traits. First, a model selection procedure is employed to help choose the genetic model. Then the optimal test derived under the selected model is constructed to test for possible association. To control the type I error rate, we derive the joint distribution of the test statistics developed in the two phases and obtain the proper size.

Conclusions: Simulation results show that the proposed method is more robust than existing methods. An application to the gene DNAH9 from Genetic Analysis Workshop 16, tested for association with anti-cyclic citrullinated peptide antibody, further demonstrates its performance.
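
To make the idea of a trend test "derived under a given genetic model" concrete, the following is a minimal sketch and not the authors' implementation. It uses a standardized Jonckheere-Terpstra statistic as a stand-in for the NPT, since the abstract does not give the exact statistic. The genotype coding (0, 1, or 2 copies of the risk allele), the grouping table `MODEL_GROUPS`, and the function name `jt_z` are all assumptions introduced for illustration. The model selection phase and the critical values from the joint distribution are not specified in the abstract and are not reproduced here.

```python
import numpy as np

# Hypothetical mapping (an assumption for illustration): genotypes coded as
# 0, 1, 2 copies of the risk allele are collapsed into ordered groups
# according to the assumed genetic model.
MODEL_GROUPS = {
    "recessive": {0: 0, 1: 0, 2: 1},
    "additive": {0: 0, 1: 1, 2: 2},
    "dominant": {0: 0, 1: 1, 2: 1},
}


def jt_z(trait, genotype, model):
    """Standardized Jonckheere-Terpstra statistic under one genetic model.

    Uses the null variance that assumes no ties among trait values, which is
    reasonable for a continuous trait.
    """
    trait = np.asarray(trait, dtype=float)
    groups = np.array([MODEL_GROUPS[model][g] for g in genotype])
    samples = [trait[groups == k] for k in np.unique(groups)]
    if len(samples) < 2:
        raise ValueError("need at least two non-empty genotype groups")

    # Count pairs (x in a lower group, y in a higher group) with x < y;
    # tied pairs contribute 1/2.
    jt = 0.0
    for i in range(len(samples)):
        for j in range(i + 1, len(samples)):
            diff = samples[j][None, :] - samples[i][:, None]
            jt += np.sum(diff > 0) + 0.5 * np.sum(diff == 0)

    n = np.array([len(s) for s in samples])
    total = n.sum()
    mean = (total**2 - np.sum(n**2)) / 4.0
    var = (total**2 * (2 * total + 3) - np.sum(n**2 * (2 * n + 3))) / 72.0
    return (jt - mean) / np.sqrt(var)


# Usage: a skewed (non-normal) trait whose level rises with the allele count.
rng = np.random.default_rng(0)
genotype = rng.choice([0, 1, 2], size=300, p=[0.49, 0.42, 0.09])
trait = rng.exponential(scale=1.0 + 0.5 * genotype)
for model in MODEL_GROUPS:
    print(model, round(float(jt_z(trait, genotype, model)), 3))
```

Comparing the three standardized values on the same data shows how the choice of model changes the statistic, which is the motivation for selecting the model before testing. Picking the largest value and comparing it with a standard normal quantile would inflate the type I error rate, which is why the procedure described above relies on the joint distribution of the two phases.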
