Effect sizes and p values: what should be reported and what should be replicated?

Despite publication of many well-argued critiques of null hypothesis testing (NHT), behavioral science researchers continue to rely heavily on this set of practices. Although we agree with most critics' catalogs of NHT's flaws, this article also takes the unusual stance of identifying virtues that may explain why NHT continues to be so extensively used. These virtues include providing results in the form of a dichotomous (yes/no) hypothesis evaluation and providing an index (p value) that has a justifiable mapping onto confidence in repeatability of a null hypothesis rejection. The most-criticized flaws of NHT can be avoided when the importance of a hypothesis, rather than the p value of its test, is used to determine that a finding is worthy of report, and when p approximately equal to .05 is treated as insufficient basis for confidence in the replicability of an isolated non-null finding. Together with many recent critics of NHT, we also urge reporting of important hypothesis tests in enough descriptive detail to permit secondary uses such as meta-analysis.
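The claim that a p value maps onto confidence in repeating a null hypothesis rejection can be illustrated with a short calculation. The sketch below is not taken from the article. It assumes a two-tailed test with an approximately normal test statistic, an exact replication with the same sample size, and a true effect equal to the effect originally observed. It also ignores the negligible chance of a significant result in the opposite direction. The function name `replication_probability` is hypothetical.

```python
from statistics import NormalDist

# Standard normal distribution used for all conversions below.
_STD_NORMAL = NormalDist()


def replication_probability(p_observed, alpha=0.05):
    """Estimate the chance that an exact replication rejects the null.

    Assumptions (illustrative, not from the article): two-tailed z test,
    same sample size in the replication, and a true effect exactly equal
    to the originally observed effect.
    """
    # z statistic implied by the observed two-tailed p value.
    z_observed = _STD_NORMAL.inv_cdf(1 - p_observed / 2)
    # Critical z needed to reject at the chosen two-tailed alpha.
    z_critical = _STD_NORMAL.inv_cdf(1 - alpha / 2)
    # The replication's z is distributed around z_observed with unit variance,
    # so the rejection probability is the upper tail beyond z_critical.
    return _STD_NORMAL.cdf(z_observed - z_critical)


if __name__ == "__main__":
    for p in (0.05, 0.01, 0.005, 0.001):
        prob = replication_probability(p)
        print(f"observed p = {p}: estimated replication probability = {prob:.2f}")
```

Under these assumptions an initial result at exactly p = .05 gives only an even chance of a second rejection at the .05 level, which is consistent with treating p approximately equal to .05 as an insufficient basis for confidence in an isolated finding. Smaller observed p values give higher estimates.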
