On the Past and Future of Null Hypothesis Significance Testing

Recent criticisms of null hypothesis significance testing (NHST) have appeared in wildlife research journals (Cherry 1998; Johnson 1999; Anderson et al. 2000, 2001; Guthery et al. 2001). In this essay, we discuss these criticisms with regard to both current usage of NHST and plausible future use. We suggest that the historical use of such procedures was reasonable and that current users might profitably spend time reading some of Fisher's applied work. However, modifications to NHST, and to the interpretations of its outcomes, might better suit the needs of modern science. Our primary conclusion is that NHST is most often useful as an adjunct to other results (e.g., effect sizes) rather than as a stand-alone result. We cite some examples, however, where NHST can be profitably used alone. Last, we find considerable experimental support for a less dogmatic attitude toward the interpretation of the probability yielded by such procedures.

[1] G. C. Tiao, et al. Bayesian inference in statistical analysis, 1973.

[2] John W. Tukey, et al. Analyzing data: Sanctification or detective work?, 1969.

[3] J. Tukey. The Philosophy of Multiple Comparisons, 1991.

[4] Ronald A. Fisher, et al. The Arrangement of Field Experiments, 1992.

[5] L. V. Jones, et al. A sensible formulation of the significance test, 2000, Psychological Methods.

[6] Howard Wainer, et al. One cheer for null hypothesis significance testing, 1999.

[7] David R. Anderson, et al. Suggestions for presenting the results of data analyses, 2001.

[8] Douglas H. Johnson. The Insignificance of Statistical Significance Testing, 1999.

[9] Sir Ronald Aylmer Fisher, et al. The Statistical Method in Psychical Research, 1929.

[10] J. H. Steiger, et al. The Problem Is Epistemology, Not Statistics: Replace Significance Tests by Confidence Intervals and Quantify Accuracy of Risky Numerical Predictions, 2002.

[11] R. Frick, et al. The appropriate use of null hypothesis testing, 1996.

[12] L. M. M.-T. Theory of Probability, 1929, Nature.

[13] P. Jolicoeur. Tests of goodness of fit, 1999.

[14] H. Kaiser. A second generation little jiffy, 1970.

[15] Steve Cherry, et al. Statistical Tests in Publications of The Wildlife Society, 1998.

[16] Robert L. Winkler, et al. Bayesian statistics: An overview, 1993.

[17] Markus J. Peterson, et al. The fall of the null hypothesis: Liabilities and opportunities, 2001.

[18] R. Abelson. Statistics As Principled Argument, 1995.

[19] F. Schmidt. Statistical Significance Testing and Cumulative Knowledge in Psychology: Implications for Training of Researchers, 1996.

[20] M. R. Novick, et al. Statistical methods for educational and psychological research, 1976.

[21] J. Andel. Sequential Analysis, 2022, The SAGE Encyclopedia of Research Design.

[22] Ronald A. Fisher, et al. Theory of Statistical Estimation, 1925, Mathematical Proceedings of the Cambridge Philosophical Society.

[23] L. A. Marascuilo, et al. Appropriate Post Hoc Comparisons for Interaction and Nested Hypotheses in Analysis of Variance Designs: The Elimination of Type IV Errors, 1970.

[24] David R. Anderson, et al. Null Hypothesis Testing: Problems, Prevalence, and an Alternative, 2000.

[25] G. Box, et al. On the Experimental Attainment of Optimum Conditions, 1951.

[26] A. Tversky, et al. Judgment under Uncertainty: Heuristics and Biases, 1974, Science.

[27] B. Thompson. Research News and Comment: AERA Editorial Policies Regarding Statistical Significance Testing: Three Suggested Reforms, 1996.

[28] J. Levin, et al. Reflections on Statistical and Substantive Significance, with a Slice of Replication, 1997.

[29] Leland Wilkinson, et al. Statistical Methods in Psychology Journals: Guidelines and Explanations, 2005.