Stratified false discovery control for large‐scale hypothesis testing with application to genome‐wide association studies

The multiplicity problem has become increasingly important in genetic studies as the capacity for high‐throughput genotyping has increased. Control of the false discovery rate (FDR) (Benjamini and Hochberg [1995] J. R. Stat. Soc. Ser. B 57:289–300) has been adopted to address the problems of false positive control and low power inherent in high‐volume genome‐wide linkage and association studies. Many genetic studies have a natural stratification of the m hypotheses to be tested. Given the FDR framework and the presence of such stratification, we investigate the performance of a stratified false discovery control approach (i.e., controlling or estimating FDR separately for each stratum) and compare it to the aggregated method (i.e., treating all hypotheses as a single stratum). Under the fixed rejection region framework (i.e., rejecting all hypotheses with unadjusted p‐values less than a pre‐specified level and then estimating FDR), we demonstrate that the aggregated FDR is a weighted average of the stratum‐specific FDRs. Under the fixed FDR framework (i.e., rejecting as many hypotheses as possible while controlling FDR at a pre‐specified level), we specify a condition necessary for the expected total number of true positives under the stratified FDR method to be equal to or greater than that obtained from the aggregated FDR method. Application to a recent genome‐wide association (GWA) study by Maraganore et al. ([2005] Am. J. Hum. Genet. 77:685–693) illustrates the potential advantages of controlling or estimating FDR by stratum. Our analyses also show that controlling FDR at a low rate, e.g., 5% or 10%, may not be feasible for some GWA studies. Genet. Epidemiol. 2006. © 2006 Wiley‐Liss, Inc.
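
The two frameworks described in the abstract can be illustrated with a small simulation. The following is a minimal sketch, not the authors' implementation: the function names (`fdp_by_stratum`, `bh_reject`, `stratified_bh`), the two-stratum layout, and the simulated p-values are all assumptions made for illustration and have no connection to the Maraganore et al. data. Under a fixed rejection region, it checks the weighted-average relationship at the level of realized false discovery proportions, taking the weights to be each stratum's share of the total rejections. Under a fixed FDR level, it applies the Benjamini and Hochberg step-up procedure once to all hypotheses and once within each stratum, then counts true positives under each approach.

```python
import numpy as np

def fdp_by_stratum(pvals, is_null, strata, alpha):
    """Fixed rejection region: reject every p < alpha, then return the realized
    false discovery proportion overall and per stratum, plus each stratum's
    share of the rejections (the weights assumed in this sketch)."""
    reject = pvals < alpha
    total_r = int(reject.sum())
    overall = int((reject & is_null).sum()) / max(total_r, 1)
    per_stratum, weights = {}, {}
    for k in np.unique(strata):
        in_k = strata == k
        r_k = int((reject & in_k).sum())
        v_k = int((reject & in_k & is_null).sum())
        per_stratum[int(k)] = v_k / max(r_k, 1)
        weights[int(k)] = r_k / max(total_r, 1)
    return overall, per_stratum, weights

def bh_reject(pvals, q):
    """Benjamini-Hochberg step-up procedure at level q; returns a boolean mask."""
    m = len(pvals)
    order = np.argsort(pvals)
    below = pvals[order] <= q * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if below.any():
        cutoff = np.nonzero(below)[0].max()  # largest rank meeting the bound
        reject[order[:cutoff + 1]] = True
    return reject

def stratified_bh(pvals, strata, q):
    """Apply BH at level q separately within each stratum."""
    reject = np.zeros(len(pvals), dtype=bool)
    for k in np.unique(strata):
        in_k = strata == k
        reject[in_k] = bh_reject(pvals[in_k], q)
    return reject

# Hypothetical data: a small stratum rich in true signals and a large sparse one.
rng = np.random.default_rng(0)
sizes, n_alt = [200, 5000], [40, 5]
p_parts, null_parts, strata_parts = [], [], []
for k, (m_k, a_k) in enumerate(zip(sizes, n_alt)):
    p_parts.append(np.concatenate([rng.beta(0.05, 1.0, a_k),
                                   rng.uniform(size=m_k - a_k)]))
    null_parts.append(np.concatenate([np.zeros(a_k, dtype=bool),
                                      np.ones(m_k - a_k, dtype=bool)]))
    strata_parts.append(np.full(m_k, k))
pvals = np.concatenate(p_parts)
is_null = np.concatenate(null_parts)
strata = np.concatenate(strata_parts)

# Fixed rejection region: aggregated proportion vs. weighted stratum proportions.
overall, per_stratum, weights = fdp_by_stratum(pvals, is_null, strata, alpha=0.001)
weighted = sum(weights[k] * per_stratum[k] for k in weights)
print("aggregated:", overall, "weighted average of strata:", weighted)

# Fixed FDR level: true positives under aggregated and stratified BH.
aggregated_reject = bh_reject(pvals, q=0.05)
stratified_reject = stratified_bh(pvals, strata, q=0.05)
print("true positives, aggregated:", int((aggregated_reject & ~is_null).sum()))
print("true positives, stratified:", int((stratified_reject & ~is_null).sum()))
```

The equality in the first comparison is an arithmetic identity for realized proportions; the abstract's statement concerns the FDR itself, and its condition for the stratified method to yield at least as many expected true positives is not reproduced here. Which approach finds more true positives in this sketch depends on the simulated configuration.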

[1] B. Efron. Large-Scale Simultaneous Hypothesis Testing, 2004.

[2] Larry Wasserman, et al. Using linkage genome scans to improve power of association in genome scans, 2006, American journal of human genetics.

[3] Y. Benjamini, et al. Quantitative Trait Loci Analysis Using the False Discovery Rate, 2005, Genetics.

[4] Mariza de Andrade, et al. High-resolution whole-genome association study of Parkinson disease, 2005, American journal of human genetics.

[5] Shuying S Li, et al. FDR‐controlling testing procedures and sample size determination for microarrays, 2005, Statistics in medicine.

[6] S. Scherer, et al. Genetic variation at the ACE gene is associated with persistent microalbuminuria and severe nephropathy in type 1 diabetes: the DCCT/EDIC Genetics Study, 2005, Diabetes.

[7] L. Wasserman, et al. Operating characteristics and extensions of the false discovery rate procedure, 2002.

[8] John D. Storey. A direct approach to false discovery rates, 2002.

[9] Radu V. Craiu, et al. Choosing the Lesser Evil: Trade-off Between False Discovery Rate and Non-Discovery Rate, 2008.

[10] L. Wasserman, et al. False discovery control with p-value weighting, 2006.

[11] Y. Benjamini, et al. On the Adaptive Control of the False Discovery Rate in Multiple Testing With Independent Statistics, 2000.

[12] John D. Storey, et al. Statistical significance for genomewide studies, 2003, Proceedings of the National Academy of Sciences of the United States of America.

[13] J. Pritchard. Are rare variants responsible for susceptibility to complex diseases?, 2001, American journal of human genetics.

[14] John D. Storey. The positive false discovery rate: a Bayesian interpretation and the q-value, 2003.

[15] B. Efron. Correlation and Large-Scale Simultaneous Significance Testing, 2007.

[16] D J Schaid, et al. Use of parents, sibs, and unrelated controls for detection of associations between genetic markers and disease, 1998, American journal of human genetics.

[17] E. Lander, et al. Genetic dissection of complex traits: guidelines for interpreting and reporting linkage results, 1995, Nature Genetics.

[18] Chiara Sabatti, et al. False discovery rate in linkage and association genome screens for complex disorders, 2003, Genetics.

[19] Larry Wasserman, et al. A Large-sample Approach to Controlling False Discovery Rates, 2022.

[20] Y. Benjamini, et al. Resampling-based false discovery rate controlling multiple test procedures for correlated test statistics, 1999.

[21] Y. Benjamini, et al. Controlling the false discovery rate: a practical and powerful approach to multiple testing, 1995.

[22] Y. Benjamini, et al. The Control of the False Discovery Rate in Multiple Testing Under Dependency, 2001.