Making meaningful inferences about magnitudes.

A study of a sample provides only an estimate of the true (population) value of an outcome statistic. A report of the study therefore usually includes an inference about the true value. Traditionally, a researcher makes an inference by declaring the value of the statistic statistically significant or nonsignificant on the basis of a P value derived from a null-hypothesis test. This approach is confusing and can be misleading, depending on the magnitude of the statistic, error of measurement, and sample size. The authors use a more intuitive and practical approach based directly on uncertainty in the true value of the statistic. First, they express the uncertainty as confidence limits, which define the likely range of the true value. They then deal with the real-world relevance of this uncertainty by taking into account values of the statistic that are substantial in some positive and negative sense, such as beneficial or harmful. If the likely range overlaps both substantially positive and substantially negative values, they infer that the outcome is unclear; otherwise, they infer that the true value has the magnitude of the observed value: substantially positive, trivial, or substantially negative. They refine this crude inference by stating qualitatively the likelihood that the true value has the observed magnitude (e.g., very likely beneficial). Quantitative or qualitative probabilities that the true value has either of the other two magnitudes, or more finely graded magnitudes (such as trivial, small, moderate, and large), can also be estimated to guide a decision about the utility of the outcome.
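The procedure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the authors' own implementation: it assumes a normal sampling distribution for the statistic, a 90% confidence level, and a single smallest substantial value (`threshold`) that applies symmetrically in the positive and negative directions. None of these choices is specified in the abstract, and the function name `magnitude_inference` and its parameters are hypothetical.

```python
from statistics import NormalDist


def magnitude_inference(effect, se, threshold, level=0.90):
    """Sketch of an inference about magnitude from an observed effect.

    effect    -- observed value of the outcome statistic
    se        -- its standard error (assumed known; normal approximation)
    threshold -- smallest substantial value (assumed symmetric, > 0)
    level     -- confidence level for the limits (0.90 is an assumption)
    """
    # Confidence limits: the likely range of the true value.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower, upper = effect - z * se, effect + z * se

    # Probabilities that the true value is substantially negative,
    # trivial, or substantially positive, under the normal approximation.
    dist = NormalDist(mu=effect, sigma=se)
    p_negative = dist.cdf(-threshold)
    p_positive = 1.0 - dist.cdf(threshold)
    p_trivial = 1.0 - p_negative - p_positive

    # Unclear if the likely range overlaps both substantial regions;
    # otherwise the inference takes the magnitude of the observed value.
    if lower < -threshold and upper > threshold:
        label = "unclear"
    elif effect >= threshold:
        label = "substantially positive"
    elif effect <= -threshold:
        label = "substantially negative"
    else:
        label = "trivial"

    return {
        "limits": (lower, upper),
        "p_negative": p_negative,
        "p_trivial": p_trivial,
        "p_positive": p_positive,
        "inference": label,
    }


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the function.
    print(magnitude_inference(effect=1.0, se=0.5, threshold=0.2))
    print(magnitude_inference(effect=0.1, se=1.0, threshold=0.2))
```

Converting the three probabilities into qualitative terms such as "very likely beneficial" requires a probability scale, which the abstract does not give, so the sketch stops at the numeric probabilities.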
