Editor’s Note: Starting in January 2007, all manuscripts accepted for publication in ANESTHESIOLOGY are undergoing review by Timothy Houle, Ph.D., for statistical testing and reporting. There are two reasons for this additional review. First, as indicated in the following editorial, authors and readers often confuse P values with the magnitude of effects, when it is often the latter that matters most. One goal of universal statistical review is to remind authors to report and emphasize the magnitude of the effects they observe, in both the Results and Discussion sections, rather than restricting their comments to P values. Second, the medical literature, including numerous reviews and original articles in ANESTHESIOLOGY, relies on combining results from separately published reports to reach consensus conclusions. A second goal of statistical review is to ensure that statistical results are reported in a manner that facilitates this subsequent research.

For the past several decades, there has been an escalating debate regarding the appropriate techniques for evaluating scientific hypotheses. In fact, the traditional method of significance testing is now being called into question. The discussion that follows is the first of several editorials over the coming months that will examine methods used to report scientific findings. Each will consider the strengths and weaknesses of these methods and common reporting errors; the series begins with a comparison of interpreting P values versus effect size measures. My hope is that these discussions will lead to clear guidelines for statistical reporting that will eventually be incorporated into the Instructions for Authors in this journal and others in our specialty.
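
To make the distinction between a P value and the magnitude of an effect concrete, the following minimal sketch holds a standardized effect size fixed and varies only the sample size. It is an illustration under stated assumptions, not a method taken from the editorial: it assumes a two-group comparison with equal group sizes and a known common standard deviation, so that a two-sample z test applies, and the function name `two_sided_p` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_sided_p(d: float, n_per_group: int) -> float:
    """Two-sided P value for a two-sample z test.

    d is the standardized mean difference (difference in means divided by
    the common standard deviation, assumed known). With n subjects per
    group, the test statistic is z = d * sqrt(n / 2).
    """
    z = d * sqrt(n_per_group / 2)
    # Use the lower tail to avoid losing precision when |z| is large.
    return 2 * NormalDist().cdf(-abs(z))


# The same small effect (d = 0.2) evaluated at increasing sample sizes.
for n in (20, 200, 2000):
    print(f"d = 0.2, n per group = {n:4d}, P = {two_sided_p(0.2, n):.3g}")
```

Under these assumptions the effect size is identical in every row, yet the P value shrinks as the groups grow. A small P value therefore indicates that an effect is unlikely to be zero, not that it is large, which is why the effect size should be reported alongside it.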