Lower bounds in multiple testing: A framework based on derandomized proxies

The bulk of work in multiple testing has focused on specifying procedures that control the false discovery rate (FDR), with relatively little attention paid to the corresponding Type II error, known as the false non-discovery rate (FNR). A more recent line of work in multiple testing has begun to investigate the tradeoff between the FDR and FNR and to provide lower bounds on the performance of procedures that depend on the model structure. Lacking thus far, however, has been a general approach to obtaining lower bounds for a broad class of models. This paper introduces an analysis strategy based on derandomization, illustrated by applications to various concrete models. Our main result is a meta-theorem that gives a general recipe for obtaining lower bounds on the combination of FDR and FNR. We illustrate this meta-theorem by deriving explicit bounds for several models, including instances with dependence, scale-transformed alternatives, and non-Gaussian-like distributions. We provide numerical simulations of some of these lower bounds, and show a close relation to the actual performance of the Benjamini-Hochberg (BH) algorithm.
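For readers unfamiliar with the procedure named above, a minimal sketch of the Benjamini-Hochberg step-up rule is given below. The function name and the choice of `alpha = 0.1` are illustrative, not taken from this paper; the rejection rule itself (reject the hypotheses with the `k` smallest p-values, where `k` is the largest index such that the `k`-th order statistic is at most `k * alpha / m`) is the standard BH procedure.

```python
import numpy as np

def benjamini_hochberg(pvals, alpha=0.1):
    """Boolean rejection mask for the BH step-up procedure at level alpha.

    Rejects the hypotheses corresponding to the k smallest p-values,
    where k is the largest index with p_(k) <= k * alpha / m.
    """
    pvals = np.asarray(pvals, dtype=float)
    m = len(pvals)
    order = np.argsort(pvals)
    sorted_p = pvals[order]
    thresholds = alpha * np.arange(1, m + 1) / m
    below = np.nonzero(sorted_p <= thresholds)[0]
    reject = np.zeros(m, dtype=bool)
    if below.size > 0:
        k = below.max()  # largest index meeting the step-up criterion
        reject[order[:k + 1]] = True
    return reject
```

Given the rejection mask and knowledge of which nulls are true (available in simulation), the empirical false discovery proportion is the fraction of rejections that are true nulls, and the empirical false non-discovery proportion is the fraction of non-rejections that are non-nulls; averaging these over Monte Carlo replicates estimates the FDR and FNR that the paper's lower bounds constrain.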
