A proposal for interpreting and reporting negative studies.

An issue of continuing interest is the interpretation and reporting of 'negative' studies, that is, studies that do not find statistically significant differences. The most common approach is the design-power method, which determines, irrespective of the observed difference, what differences the study could have been expected to detect. We propose an alternative approach, the application of equivalence testing methods, in which we define equivalence to mean that the actual difference lies within some specified limits. In contrast to the design-power approach, this approach provides a way of quantifying (with p-values) what was actually determined from the study, rather than stating what the study might have accomplished with some degree of certainty (power). For example, a possible outcome of the equivalence testing approach is the conclusion, at the 5 per cent level, that two means (or proportions) do not differ by more than some specified amount. The equivalence testing approach applies to any study design. We illustrate the method with a cancer clinical trial and an epidemiologic case-control study. In addition, for studies in which limits cannot be specified a priori, we propose the use of equivalence curves to summarize and present the study results.
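
To make the idea concrete, the sketch below is a minimal illustration, not the authors' exact procedure: it uses a standard two one-sided tests (TOST) construction for the difference of two means, with equivalence declared when the observed difference lies within prespecified limits. The function name tost_two_means, the pooled-variance t statistics, the simulated data, and the illustrative limit delta are all assumptions introduced here for the example; the closing lines trace a simple equivalence curve by recomputing the p-value over a grid of candidate limits.

import numpy as np
from scipy import stats

def tost_two_means(x, y, delta):
    """Two one-sided tests (TOST) for equivalence of two sample means.

    Tests H0: |mu_x - mu_y| >= delta against H1: |mu_x - mu_y| < delta,
    using pooled-variance two-sample t statistics (an assumption made here
    for simplicity).  Returns the TOST p-value, i.e. the larger of the two
    one-sided p-values.
    """
    nx, ny = len(x), len(y)
    diff = np.mean(x) - np.mean(y)
    sp2 = ((nx - 1) * np.var(x, ddof=1) + (ny - 1) * np.var(y, ddof=1)) / (nx + ny - 2)
    se = np.sqrt(sp2 * (1.0 / nx + 1.0 / ny))
    df = nx + ny - 2
    # Test 1: H0: mu_x - mu_y <= -delta  vs  H1: mu_x - mu_y > -delta
    p_lower = stats.t.sf((diff + delta) / se, df)
    # Test 2: H0: mu_x - mu_y >= +delta  vs  H1: mu_x - mu_y < +delta
    p_upper = stats.t.cdf((diff - delta) / se, df)
    return max(p_lower, p_upper)

# Illustrative data (simulated; not the paper's clinical trial or case-control data).
rng = np.random.default_rng(0)
x = rng.normal(10.0, 2.0, size=40)
y = rng.normal(10.3, 2.0, size=40)
print(tost_two_means(x, y, delta=1.0))

# A simple "equivalence curve": the TOST p-value traced over a grid of
# candidate limits, for settings where no limit can be fixed a priori.
curve = [(d, tost_two_means(x, y, d)) for d in np.linspace(0.25, 2.0, 8)]

Under this construction, concluding equivalence at the 5 per cent level corresponds to the TOST p-value (the larger of the two one-sided p-values) falling below 0.05 for the chosen limit.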
