Probability as certainty: Dichotomous thinking and the misuse of p values

Significance testing is widely used and often criticized. The Task Force on Statistical Inference of the American Psychological Association (TFSI, APA; Wilkinson & TFSI, 1999) addressed the use of significance testing and made recommendations that were incorporated in the fifth edition of the APA Publication Manual (APA, 2001). They emphasized the interpretation of significance testing and the importance of reporting confidence intervals and effect sizes. We examined whether 286 Psychonomic Bulletin & Review articles submitted before and after the publication of the TFSI recommendations by APA complied with these recommendations. Interpretation errors when using significance testing were still made frequently, and the new prescriptions were not yet followed on a large scale. Changing the practice of reporting statistics seems doomed to be a slow process.

[1] G. Cumming, et al. Past and Future American Psychological Association Guidelines for Statistical Practice, 2002.

[2] R. Rosenthal, et al. Statistical Procedures and the Justification of Knowledge in Psychological Science, 1989.

[3] David Weisburd, et al. When can we Conclude that Treatments or Programs “Don’t Work”?, 2003.

[4] Leland Wilkinson, et al. Statistical Methods in Psychology Journals: Guidelines and Explanations, 2005.

[5] M. Oakes. Statistical Inference: A Commentary for the Social and Behavioural Sciences, 1986.

[6] Neil Thomason, et al. Reporting of statistical inference in the Journal of Applied Psychology: Little evidence of reform, 2001.

[7] W. W. Rozeboom. The fallacy of the null-hypothesis significance test, 1960, Psychological Bulletin.

[8] D. Bakan, et al. The test of significance in psychological research, 1966, Psychological Bulletin.

[9] Robert Rosenthal, et al. The Interpretation of Levels of Significance by Psychological Researchers, 1963.

[10] W. Dunlap, et al. On the Logic and Purpose of Significance Testing, 1997.

[11] F. Schmidt. Statistical Significance Testing and Cumulative Knowledge in Psychology: Implications for Training of Researchers, 1996.

[12] H. J. Arnold. Introduction to the Practice of Statistics, 1990.

[13] W. Tryon. Evaluating statistical difference, equivalence, and indeterminacy using inferential confidence intervals: an integrated alternative method of conducting null hypothesis statistical tests, 2001, Psychological Methods.

[14] Carmen Batanero, et al. Controversies Around the Role of Statistical Tests in Experimental Research, 2000.

[15] R. Falk, et al. Significance Tests Die Hard, 1995.

[16] M. Masson. Using confidence intervals for graphically based data interpretation, 2003, Canadian Journal of Experimental Psychology = Revue canadienne de psychologie experimentale.

[17] Jacques Poitevineau, et al. Even statisticians are not immune to misinterpretations of Null Hypothesis Significance Tests, 2003.

[18] Neil Thomason, et al. Colloquium on Effect Sizes: the Roles of Editors, Textbook Authors, and the Publication Manual, 2001.

[19] Tammi Vacha-Haase, et al. Statistical Significance should not be Considered one of Life’s Guarantees: Effect Sizes are Needed, 2001.

[20] Jacob Cohen. The earth is round (p < .05), 1994.

[21] L. Harlow, et al. What if there were no significance tests, 1997.

[22] Cynthia Lum, et al. "Don't Work"?, 2003.