Marker-Trait Complete Analysis

A recurring problem in genomics involves testing the association of one or more traits of interest with multiple genomic features. Feature-trait squared correlations r² are commonly used statistics that are sensitive to trend associations. It is often of interest to test across collections {r²} over markers and/or traits using both maxima and sums. However, both trait-trait and marker-marker correlations may be strong and must be taken into account. The primary tools for multiple testing suffer from various shortcomings, including inaccurate p-values from asymptotic methods that may not be applicable. Moreover, there is a lack of general tools for fast screening and follow-up of regions of interest.

To address these difficulties, we propose the MTCA approach, for Marker-Trait Complete Analysis. MTCA encompasses a large number of existing approaches and provides accurate p-values over markers and traits for maxima and sums of r² statistics. MTCA uses the conditional inference implicit in permutation as a motivational framework, but provides an option for fast screening with two novel tools: (i) a multivariate-normal approximation for the max statistic, and (ii) the concept of eigenvalue-conditional moments for the sum statistic. We provide examples for gene-based association testing of a continuous phenotype and for cis-eQTL analysis, but MTCA can be applied in a much wider variety of settings and platforms.
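To make the max and sum statistics concrete, the following is a minimal sketch of the direct permutation baseline that the abstract names as MTCA's motivating framework. It is not the MTCA implementation and does not include the multivariate-normal or eigenvalue-conditional approximations. The function names, the single-trait setup, and the simulated genotypes are all assumptions made for illustration.

```python
import numpy as np

def squared_correlations(markers, trait):
    """r^2 between one trait (length n) and each column of markers (n x m)."""
    x = markers - markers.mean(axis=0)
    y = trait - trait.mean()
    num = x.T @ y
    den = np.sqrt((x ** 2).sum(axis=0) * (y ** 2).sum())
    return (num / den) ** 2

def permutation_pvalues(markers, trait, n_perm=500, seed=0):
    """Permutation p-values for the max and the sum of r^2 over markers.

    Permuting the trait keeps the marker-marker correlation structure
    intact, so both statistics are referred to a null that respects it.
    """
    rng = np.random.default_rng(seed)
    observed = squared_correlations(markers, trait)
    obs_max, obs_sum = observed.max(), observed.sum()
    exceed_max = 0
    exceed_sum = 0
    for _ in range(n_perm):
        r2 = squared_correlations(markers, rng.permutation(trait))
        exceed_max += int(r2.max() >= obs_max)
        exceed_sum += int(r2.sum() >= obs_sum)
    # Add-one correction so a Monte Carlo p-value is never exactly zero.
    return (exceed_max + 1) / (n_perm + 1), (exceed_sum + 1) / (n_perm + 1)

# Hypothetical data: 200 samples, 5 markers coded 0/1/2, one causal marker.
rng = np.random.default_rng(1)
genotypes = rng.binomial(2, 0.3, size=(200, 5)).astype(float)
phenotype = genotypes[:, 0] + rng.normal(size=200)
p_max, p_sum = permutation_pvalues(genotypes, phenotype)
```

The cost of this loop grows with the number of permutations needed to resolve small p-values, which is the screening bottleneck that the abstract's fast approximations are meant to avoid.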
