On the appropriate interpretation of evidence: the example of culture media and birth weight

The recent paper by Kleijkers et al. (2016) reported the results of a well-conducted randomized trial comparing two culture media systems and showed a significant difference in birth weight between embryos incubated in the two media. Like many results in assisted reproduction research, it was likely to provoke controversy, given the emotive and commercial interests at stake. The subsequent debate in the pages of this journal (Kleijkers et al., 2017; Rieger, 2017; Sunde et al., 2017; Thompson et al., 2017) was of interest for its content and highlighted the urgent need for further research. It also exposed deficiencies in how we, as a community, assess and interpret evidence from such trials. In this commentary, we take that debate as an exemplar and look carefully at the evidence presented by Kleijkers et al. (2016) to identify where misleading interpretations have been suggested. This is not intended as a criticism of specific authors: the misinterpretations we identify are common across our literature and are regularly encountered in the statistical review of papers submitted to this journal. This debate, however, provides a rich source of collated examples. Although we comment on the previous articles and letters and refer the interested reader to those for more detail, this commentary is intended to stand alone and can be read without further reference to the original sources.

[1]  P. Vercellini, et al.  ‘Forever Young’†—Testosterone replacement therapy: a blockbuster drug despite flabby evidence and broken promises, 2017, Human Reproduction.

[2]  D. Rieger.  All aspects of human ART must be considered, not just the embryo culture medium, 2016, Human Reproduction.

[3]  A. Wetzels, et al.  Reply II: Embryo culture media effects, 2016, Human Reproduction.

[4]  Sjoerd Repping, et al.  Influence of embryo culture medium (G5 and HTF) on pregnancy and perinatal outcome after IVF: a multicenter RCT, 2016, Human Reproduction.

[5]  S. Missmer, et al.  P-values and reproductive health: what can clinical researchers learn from the American Statistical Association?, 2016, Human Reproduction.

[6]  M. Baker.  1,500 scientists lift the lid on reproducibility, 2016, Nature.

[7]  J. Sterne, et al.  The Cochrane Collaboration’s tool for assessing risk of bias in randomised trials, 2011, BMJ: British Medical Journal.

[8]  D. Altman, et al.  Outcome reporting bias in randomized trials funded by the Canadian Institutes of Health Research, 2004, Canadian Medical Association Journal.

[9]  P. Sandercock, et al.  Framework for design and evaluation of complex interventions to improve health, 2000, BMJ: British Medical Journal.

[10]  D. G. Altman, et al.  Statistics notes: Absence of evidence is not evidence of absence, 1995.

[11]  L. Bolognese, et al.  Randomised trial of intravenous streptokinase, oral aspirin, both, or neither among 17 187 cases of suspected acute myocardial infarction: ISIS-2, 1988, The Lancet.

[12]  D. G. Altman, et al.  Estimating with confidence, 1988, British Medical Journal.

[13]  J. Harper, et al.  Reply I: Embryo culture media effects, 2017, Human Reproduction.

[14]  S. Merhar, et al.  Letter to the editor, 2005, IEEE Communications Magazine.