Data fabrication and other reasons for non‐random sampling in 5087 randomised, controlled trials in anaesthetic and general medical journals

Randomised, controlled trials have been retracted after publication because of data fabrication and inadequate ethical approval. Fabricated data have included baseline variables, for instance, age, height or weight. Statistical tests can determine the probability of the distribution of means, given their standard deviation and the number of participants in each group. Randomised, controlled trials have been retracted after the data distributions have been calculated as improbable. Most retracted trials have been written by anaesthetists and published by specialist anaesthetic journals. I wanted to explore whether the distribution of baseline data in trials was consistent with the expected distribution. I wanted to determine whether trials retracted after publication had distributions different to trials that have not been retracted. I wanted to determine whether data distributions in trials published in specialist anaesthetic journals have been different to distributions in non‐specialist medical journals. I analysed the distribution of 72,261 means of 29,789 variables in 5087 randomised, controlled trials published in eight journals between January 2000 and December 2015: Anaesthesia (399); Anesthesia and Analgesia (1288); Anesthesiology (541); British Journal of Anaesthesia (618); Canadian Journal of Anesthesia (384); European Journal of Anaesthesiology (404); Journal of the American Medical Association (518) and New England Journal of Medicine (935). I chose these journals as I had electronic access to the full text. Trial p values were distorted by an excess of baseline means that were similar and an excess that were dissimilar: 763/5015 (15.2%) trials that had not been retracted from publication had p values that were within 0.05 of 0 or 1 (expected 10%), that is, a 5.2% excess, p = 1.2 × 10−7. The p values of 31/72 (43%) trials that had been retracted after publication were within 0.05 of 0 or 1, a rate different to that for unretracted trials, p = 1.03 × 10−10. The difference between the distributions of these two subgroups was confirmed by comparison of their overall distributions, p = 5.3 × 10−15. Each journal exhibited the same abnormal distribution of baseline means. There was no difference in distributions of baseline means for 1453 trials in non‐anaesthetic journals and 3634 trials in anaesthetic journals, p = 0.30. The rate of retractions from JAMA and NEJM, 6/1453 or 1 in 242, was one‐quarter the rate from the six anaesthetic journals, 66/3634 or 1 in 55, relative risk (99%CI) 0.23 (0.08–0.68), p = 0.00022. A probability threshold of 1 in 10,000 identified 8/72 (11%) retracted trials (7 by Fujii et al.) and 82/5015 (1.6%) unretracted trials. Some p values were so extreme that the baseline data could not be correct: for instance, for 43/5015 unretracted trials the probability was less than 1 in 1015 (equivalent to one drop of water in 20,000 Olympic‐sized swimming pools). A probability threshold of 1 in 100 for two or more trials by the same author identified three authors of retracted trials (Boldt, Fujii and Reuben) and 21 first or corresponding authors of 65 unretracted trials. Fraud, unintentional error, correlation, stratified allocation and poor methodology might have contributed to the excess of randomised, controlled trials with similar or dissimilar means, a pattern that was common to all the surveyed journals. It is likely that this work will lead to the identification, correction and retraction of hitherto unretracted randomised, controlled trials.