The Bayesian Methodology of Sir Harold Jeffreys as a Practical Alternative to the P Value Hypothesis Test

Despite an ongoing stream of lamentations, many empirical disciplines still treat the p value as the sole arbiter to separate the scientific wheat from the chaff. The continued reign of the p value is arguably due in part to a perceived lack of workable alternatives. In order to be workable, any alternative methodology must be (1) relevant: it has to address the practitioners’ research question, which—for better or for worse—most often concerns the test of a hypothesis, and less often concerns the estimation of a parameter; (2) available: it must have a concrete implementation for practitioners’ statistical workhorses such as the t test, regression, and ANOVA; and (3) easy to use: methods that demand practitioners switch to the theoreticians’ programming tools will face an uphill struggle for adoption. The above desiderata are fulfilled by Harold Jeffreys’s Bayes factor methodology as implemented in the open-source software JASP. We explain Jeffreys’s methodology and showcase its practical relevance with two examples.
