Rank truncated product of P‐values, with application to genomewide association scans

Large exploratory studies are often characterized by a preponderance of true null hypotheses, with a small number, though more than one, of false hypotheses. Traditional multiple‐test adjustments consider either each hypothesis separately or all hypotheses simultaneously, but it may be more desirable to consider the combined evidence for subsets of hypotheses, in order to reduce the number of hypotheses to a manageable size. Previously, Zaykin et al. ([2002] Genet. Epidemiol. 22:170–185) proposed forming the product of all P‐values below a preset threshold, in order to combine evidence from all significant tests. Here we consider a complementary strategy: form the product of the K most significant P‐values. This has certain advantages for genomewide association scans: K can be chosen on the basis of a hypothesised disease model, and is independent of sample size. Furthermore, the alternative hypothesis corresponds more closely to the experimental situation where all loci have fixed effects. We give the distribution of the rank truncated product and suggest some methods to account for correlated tests in genomewide scans. We show that, under realistic scenarios, the rank truncated product provides increased power to detect genomewide association, while identifying a candidate set of good quality and fixed size for follow‐up studies. Genet Epidemiol 25:360–366, 2003. © 2003 Wiley‐Liss, Inc.
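As an illustration of the statistic described above, the following is a minimal sketch in Python, not the authors' implementation. It computes the log of the product of the K smallest P‐values and estimates its significance by Monte Carlo simulation, assuming independent tests with uniform null P‐values. The function names `rtp_statistic` and `rtp_pvalue`, and the parameters `n_sim` and `seed`, are hypothetical and chosen for this example only. The simulation stands in for the distribution given in the paper and does not reproduce it.

```python
import numpy as np


def rtp_statistic(pvalues, k):
    """Log of the product of the k smallest P-values (hypothetical helper)."""
    p = np.sort(np.asarray(pvalues, dtype=float))
    if not 1 <= k <= p.size:
        raise ValueError("k must be between 1 and the number of tests")
    # Summing logs avoids numerical underflow in the raw product.
    return float(np.sum(np.log(p[:k])))


def rtp_pvalue(pvalues, k, n_sim=10000, seed=0):
    """Monte Carlo significance of the rank truncated product.

    Assumption: tests are independent, so null P-values are i.i.d. uniform.
    """
    rng = np.random.default_rng(seed)
    observed = rtp_statistic(pvalues, k)
    n_tests = len(pvalues)
    # Draw from (0, 1] so that log() never sees an exact zero.
    null = 1.0 - rng.random(size=(n_sim, n_tests))
    # After partitioning, the first k columns of each row hold its k smallest values.
    smallest = np.partition(null, k - 1, axis=1)[:, :k]
    null_stats = np.log(smallest).sum(axis=1)
    # Add-one correction keeps the estimate strictly positive.
    return float((1 + np.sum(null_stats <= observed)) / (n_sim + 1))
```

For example, `rtp_pvalue(p, k=5)` would combine the five most significant tests in the array `p`. For correlated tests, as in a genomewide scan, the uniform draws would have to be replaced by a null sample that preserves the correlation between tests, for instance one obtained by permuting phenotype labels. That step is not shown here.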
