Estimating false discovery rates for contingency tables

When testing a large number of hypotheses, it can be helpful to estimate or control the false discovery rate (FDR), the expected proportion of tests called significant that are truly null. The FDR is closely linked to the probability that a truly null test is called significant, and thus a number of methods have been described that estimate or control the FDR directly from the p-values of the hypothesis tests. Most of these methods assume that the p-values are continuous and uniformly distributed under the null hypothesis, an assumption that often does not hold for finite data. In this paper, we consider the estimation of the FDR for contingency tables. We show how Fisher’s exact test can be extended to efficiently calculate the exact null distribution over a set of contingency tables. Using this exact null distribution, we explore the estimation of each term of the FDR estimator, characterize the asymptotic convergence of the estimator, and show how the conservative bias can be reduced by removing certain tests from consideration. The resulting estimator has substantially less conservative bias than traditional approaches.
