I can see clearly now: Reinterpreting statistical significance

Null-hypothesis significance testing remains popular despite decades of concern about misuse and misinterpretation. We believe that much of the problem is due to language: significance testing has little to do with other meanings of the word "significance". Despite the limitations of null-hypothesis tests, we argue here that they remain useful in many contexts as a guide to whether a certain effect can be seen clearly in that context (e.g., whether we can clearly see that a correlation or between-group difference is positive or negative). We therefore suggest that researchers describe the conclusions of null-hypothesis tests in terms of statistical "clarity" rather than statistical "significance". This simple semantic change could substantially enhance clarity in statistical communication.
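As a rough illustration of the proposed wording, the sketch below runs a two-sided permutation test on a between-group difference in means and reports the outcome in terms of clarity. The abstract does not prescribe a particular test, so the choice of a permutation test, the function names `permutation_p_value` and `describe_clarity`, the 0.05 threshold, and the example data are all assumptions made for illustration only.

```python
import random
from statistics import mean


def permutation_p_value(a, b, n_perm=10_000, seed=0):
    """Two-sided permutation p-value for the difference in group means.

    Hypothetical helper for illustration; not taken from the paper.
    """
    rng = random.Random(seed)
    observed = mean(a) - mean(b)
    pooled = list(a) + list(b)
    extreme = 0
    for _ in range(n_perm):
        # Reassign group labels at random and recompute the difference.
        rng.shuffle(pooled)
        diff = mean(pooled[:len(a)]) - mean(pooled[len(a):])
        if abs(diff) >= abs(observed):
            extreme += 1
    # Add-one correction keeps the estimated p-value strictly above zero.
    return observed, (extreme + 1) / (n_perm + 1)


def describe_clarity(observed, p, alpha=0.05):
    """Phrase the test result as clarity about the sign of the effect."""
    if p < alpha:
        direction = "positive" if observed > 0 else "negative"
        return f"clearly {direction} difference (p = {p:.3f})"
    return f"sign of the difference is unclear (p = {p:.3f})"


if __name__ == "__main__":
    # Made-up measurements, used only to exercise the functions.
    treated = [5.1, 5.3, 5.2, 5.4, 5.0, 5.5, 5.3, 5.2]
    control = [4.1, 4.0, 4.2, 3.9, 4.3, 4.1, 4.0, 4.2]
    obs, p = permutation_p_value(treated, control)
    print(describe_clarity(obs, p))
```

Under this phrasing, a small p-value is reported as a statement that the direction of the difference can be seen clearly in these data, and a large p-value as a statement that the direction is unclear, rather than as a claim about the importance of the effect.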
