POWER-ENHANCED MULTIPLE DECISION FUNCTIONS CONTROLLING FAMILY-WISE ERROR AND FALSE DISCOVERY RATES

Improved procedures, in terms of smaller missed discovery rates (MDR), are developed and studied for performing multiple hypotheses testing with weak and strong control of the family-wise error rate (FWER) or the false discovery rate (FDR). The improvement over existing procedures, such as the Šidák procedure for FWER control and the Benjamini-Hochberg (BH) procedure for FDR control, is achieved by exploiting possible differences in the powers of the individual tests. The results signal the need to take into account the powers of the individual tests and to have multiple hypotheses decision functions that are not limited to using only the individual p-values, as is the case, for example, with the Šidák, Bonferroni, or BH procedures. They also enhance understanding of the role of the powers of the individual tests, or more precisely the receiver operating characteristic (ROC) functions of decision processes, in the search for better multiple hypotheses testing procedures. A decision-theoretic framework is utilized, and through auxiliary randomizers the procedures can be used with discrete or mixed-type data or with rank-based nonparametric tests. This is in contrast to existing p-value based procedures, whose theoretical validity is contingent on each of the p-value statistics being stochastically equal to or greater than a standard uniform variable under the null hypothesis. The proposed procedures are relevant in the analysis of high-dimensional "large M, small n" data sets arising in the natural, physical, medical, economic, and social sciences, whose generation is accelerated by advances in high-throughput technology, notably, but not limited to, microarray technology.
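For orientation, the two p-value based baselines named above can be sketched as follows. This is a minimal illustration of the standard Šidák and BH rules only, not of the power-enhanced procedures developed in the paper. The function names `sidak_reject` and `bh_reject` and the example p-values are hypothetical and chosen purely for illustration.

```python
def sidak_reject(pvalues, alpha=0.05):
    """Sidak rule: reject H_m when p_m <= 1 - (1 - alpha)^(1/M)."""
    m = len(pvalues)
    if m == 0:
        return []
    # Common per-test threshold; it ignores any differences in test power.
    threshold = 1.0 - (1.0 - alpha) ** (1.0 / m)
    return [p <= threshold for p in pvalues]


def bh_reject(pvalues, q=0.05):
    """Benjamini-Hochberg step-up rule at FDR level q."""
    m = len(pvalues)
    # Indices of the p-values, ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        # Track the largest rank k with p_(k) <= k * q / M.
        if pvalues[idx] <= rank * q / m:
            k_max = rank
    reject = [False] * m
    # Reject the hypotheses with the k_max smallest p-values.
    for idx in order[:k_max]:
        reject[idx] = True
    return reject


# Illustrative (made-up) p-values for M = 5 tests.
example_pvalues = [0.001, 0.012, 0.025, 0.038, 0.6]
sidak_decisions = sidak_reject(example_pvalues, alpha=0.05)
bh_decisions = bh_reject(example_pvalues, q=0.05)
```

Both rules depend on the data only through the individual p-values and treat every test identically, which is the limitation the abstract points to.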
