The appropriate use of null hypothesis testing.

The many criticisms of null hypothesis testing suggest when it is not useful and what it should not be used for. This article explores when and why its use is appropriate. Null hypothesis testing is insufficient when the size of an effect is important, but it is ideal for testing ordinal claims about the order of conditions, which are common in psychology. Null hypothesis testing is also insufficient for determining beliefs, but it is ideal for demonstrating sufficient evidential strength to support an ordinal claim, with sufficient evidence being one criterion for a finding entering the corpus of legitimate findings in psychology. The line between sufficient and insufficient evidence is currently set at p < .05; there is little reason for allowing experimenters to select their own value of alpha. Thus null hypothesis testing is an optimal method for demonstrating sufficient evidence for an ordinal claim.
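
As a rough illustration of what an ordinal claim tested against a fixed criterion can look like, the sketch below compares two conditions and reports only the direction of the difference, or no claim at all when p is not below .05. The exact permutation test, the function names, and the scores are assumptions chosen for illustration; the abstract does not prescribe a particular test or data set.

```python
from itertools import combinations
from statistics import mean

ALPHA = 0.05  # fixed criterion named in the abstract, not chosen per experiment


def permutation_p_value(a, b):
    """Exact two-sided permutation p value for a difference in means.

    Illustrative choice of test (an assumption, not the article's method).
    Enumerates every way of splitting the pooled scores into groups of the
    original sizes and counts the splits at least as extreme as the observed one.
    """
    pooled = list(a) + list(b)
    n_a, n_b = len(a), len(b)
    total = sum(pooled)
    observed = abs(mean(a) - mean(b))
    extreme = 0
    n_splits = 0
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        if diff >= observed - 1e-12:  # small tolerance for float rounding
            extreme += 1
        n_splits += 1
    return extreme / n_splits


def ordinal_claim(a, b, alpha=ALPHA):
    """Return the supported ordinal claim (or 'no claim') and the p value."""
    p = permutation_p_value(a, b)
    if p >= alpha:
        return "no claim", p
    return ("A > B" if mean(a) > mean(b) else "A < B"), p


# Hypothetical scores for two conditions (made up for this sketch).
condition_a = [12, 14, 15, 17, 18]
condition_b = [5, 7, 8, 9, 10]
claim, p = ordinal_claim(condition_a, condition_b)
print(claim, round(p, 4))
```

The function reports a direction only, never a magnitude, which matches the limited role the abstract assigns to the test: it says whether the evidence for an ordering is sufficient, not how large the effect is.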
