E-values: Calibration, combination, and applications

Multiple testing of a single hypothesis and testing multiple hypotheses are usually done in terms of p-values. In this paper, we replace p-values with their natural competitor, e-values, which are closely related to betting, Bayes factors and likelihood ratios. We demonstrate that e-values are often mathematically more tractable; in particular, in multiple testing of a single hypothesis, e-values can be merged simply by averaging them. This allows us to develop efficient procedures using e-values for testing multiple hypotheses.
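The merging-by-averaging claim can be illustrated with a minimal sketch. It relies on two standard facts: an e-value is a nonnegative random variable whose expectation under the null hypothesis is at most 1, and by Markov's inequality an e-value of at least 1/alpha is reached with probability at most alpha under the null. The arithmetic mean of e-values is again an e-value whatever their dependence, because expectation is linear. The Gaussian likelihood-ratio e-value, the function names, and the sample observations below are illustrative assumptions, not taken from the paper.

```python
import math


def likelihood_ratio_e_value(x, mu=1.0):
    """Illustrative e-value: likelihood ratio of N(mu, 1) to N(0, 1) at x.

    Under the null N(0, 1) its expectation is exactly 1, so it is a valid
    e-value. The choice of alternative mean mu is an assumption made here.
    """
    return math.exp(mu * x - 0.5 * mu * mu)


def merge_by_average(e_values):
    """Merge e-values for the same null hypothesis by their arithmetic mean."""
    if not e_values:
        raise ValueError("need at least one e-value")
    if any(e < 0 for e in e_values):
        raise ValueError("e-values must be nonnegative")
    return sum(e_values) / len(e_values)


def reject(e_value, alpha=0.05):
    """Reject the null at level alpha when the e-value reaches 1/alpha."""
    return e_value >= 1.0 / alpha


# Hypothetical observations from several studies of the same null hypothesis.
observations = [2.9, 3.4, 0.3, 4.1]
e_values = [likelihood_ratio_e_value(x) for x in observations]
merged = merge_by_average(e_values)
print(f"merged e-value: {merged:.2f}, reject at 0.05: {reject(merged)}")
```

No independence assumption is needed for the averaging step, in contrast to, for example, multiplying e-values, which is valid under independence.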
