Estimation of False Discovery Rate Using Sequential Permutation p‐Values

We consider the problem of testing each of m null hypotheses with a sequential permutation procedure in which the number of draws from the permutation distribution of each test statistic is a random variable. Each sequential permutation p-value has a null distribution that is nonuniform on a discrete support. We show how to use a collection of such p-values to estimate the number of true null hypotheses m0 among the m null hypotheses tested and how to estimate the false discovery rate (FDR) associated with p-value significance thresholds. We use real data analyses and simulation studies to evaluate and illustrate the performance of our proposed approach relative to standard, more computationally intensive strategies. We find that our sequential approach produces similar results with far less computational expense in a variety of scenarios.
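To make the idea of a random number of permutation draws concrete, the following is a minimal sketch of one well-known sequential stopping rule: stop as soon as h permuted statistics are at least as large as the observed statistic, or after n - 1 draws, whichever comes first. The abstract does not specify the exact procedure used, so the rule itself, the function names, the defaults h = 10 and n = 1000, and the two-sample mean-difference statistic are all illustrative assumptions.

```python
import random


def sequential_permutation_pvalue(observed, draw_null_stat, h=10, n=1000):
    """Sequential permutation p-value under an assumed stopping rule.

    Draws permuted statistics one at a time and stops when h of them are
    >= observed, or after n - 1 draws. Returns h / (number of draws) in the
    first case and (exceedances + 1) / n in the second.
    """
    exceedances = 0
    for draws in range(1, n):  # at most n - 1 draws
        if draw_null_stat() >= observed:
            exceedances += 1
            if exceedances == h:
                return h / draws  # early stop: large p-value, few draws
    return (exceedances + 1) / n  # ran all n - 1 draws


def make_permutation_sampler(x, y, rng):
    """Hypothetical helper: permuted absolute mean difference for two samples."""
    pooled = list(x) + list(y)
    nx = len(x)

    def draw():
        rng.shuffle(pooled)  # random relabelling of the pooled observations
        a, b = pooled[:nx], pooled[nx:]
        return abs(sum(a) / len(a) - sum(b) / len(b))

    return draw


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.gauss(0.0, 1.0) for _ in range(20)]
    y = [rng.gauss(1.0, 1.0) for _ in range(20)]
    observed = abs(sum(x) / len(x) - sum(y) / len(y))
    p = sequential_permutation_pvalue(observed, make_permutation_sampler(x, y, rng))
    print(p)
```

Under this rule a p-value can only take values of the form h/l (for l = h, ..., n - 1) or (g + 1)/n (for g = 0, ..., h - 1), which illustrates why such p-values live on a discrete support. Tests that are clearly non-significant stop after only a few draws, which is where the computational savings come from.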
