The Insignificance of Statistical Significance Testing

Despite their wide use in scientific journals such as The Journal of Wildlife Management, statistical hypothesis tests add very little value to the products of research. Indeed, they frequently confuse the interpretation of data. This paper describes how statistical hypothesis tests are often viewed, and then contrasts that interpretation with the correct one. I discuss the arbitrariness of P-values, conclusions that the null hypothesis is true, power analysis, and distinctions between statistical and biological significance. Statistical hypothesis testing, in which the null hypothesis about the properties of a population is almost always known a priori to be false, is contrasted with scientific hypothesis testing, which examines a credible null hypothesis about phenomena in nature. More meaningful alternatives are briefly outlined, including estimation and confidence intervals for determining the importance of factors, decision theory for guiding actions in the face of uncertainty, and Bayesian approaches to hypothesis testing and other statistical practices.
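The distinction between statistical and biological significance can be illustrated with a short numerical sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the helper name `z_test_summary`, a true difference of 0.1 units between two group means, a common standard deviation of 10, and a large-sample normal (z) approximation. It suggests how a fixed, biologically trivial difference moves from "nonsignificant" to "highly significant" purely as sample size grows, while the confidence interval keeps reporting the same small effect with increasing precision.

```python
import math


def z_test_summary(diff, sd, n, z_crit=1.96):
    """Two-sided z-test for a difference between two group means.

    Hypothetical helper: assumes equal group sizes n, a known common
    standard deviation sd, and a normal approximation.
    Returns the P-value and an approximate 95% confidence interval.
    """
    se = sd * math.sqrt(2.0 / n)               # standard error of the difference
    z = diff / se                              # test statistic under H0: diff = 0
    p = math.erfc(abs(z) / math.sqrt(2.0))     # two-sided normal tail probability
    return p, (diff - z_crit * se, diff + z_crit * se)


# Same tiny effect (0.1 units, SD = 10), increasingly large samples.
for n in (100, 10_000, 1_000_000):
    p, (lo, hi) = z_test_summary(diff=0.1, sd=10.0, n=n)
    print(f"n={n:>9,}  P={p:.3g}  95% CI=({lo:.3f}, {hi:.3f})")
```

Under these assumptions the P-value mostly reflects sample size, whereas the interval estimate lets a reader judge directly whether a difference of about 0.1 units matters biologically.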
