An analysis of the quality of experimental design and reliability of results in tribology research

Abstract In recent years, several high-profile projects have questioned the repeatability and validity of scientific research in the fields of psychology and medicine. In general, these studies have shown or estimated that fewer than 50% of published research findings are true or replicable, even when no breaches of ethics occur. This stems from widespread poor study design: either the use of underpowered studies or of designs that allow bias to enter the results. In this work, we aim to assess, for the first time, the prevalence of good study design in the field of tribology. A set of simple criteria covering factors such as randomisation, blinding, and the use of controls and repeated tests was developed. These criteria were applied in a mass review of the 2017 output of five highly regarded tribology journals. In total, 379 papers were reviewed by 26 reviewers, representing 28% of the selected journals' output for 2017. Our results show that the prevalence of these simple aspects of study design is poor. Of 290 experimental studies, 2.2% used any form of blinding; 3.2% randomised either the tests or the test samples, while none randomised both; 30% repeated experiments three or more times, and 86% of those that repeated tests used single batches of test materials; 4.4% performed statistical tests on their data. Because repeated tests and statistical analysis are so rare, it is impossible to give a realistic estimate of the percentage of published works that are likely to be false positives; however, these results compare poorly with other, better-studied fields. Finally, recommendations are given for improved study design for researchers and for group practice for research group leaders.
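Two of the design practices the abstract finds lacking, randomised test order and a priori sample-size planning, are straightforward to implement. The sketch below is illustrative only and not taken from the paper: the function names are hypothetical, and the sample-size formula is the standard normal approximation for a two-sided, two-sample comparison of means at a given Cohen's d, not the paper's method.

```python
import math
import random
from statistics import NormalDist


def sample_size_per_group(d, alpha=0.05, power=0.80):
    """Approximate specimens per group for a two-sided two-sample test.

    Uses the normal approximation n = 2 * ((z_{1-alpha/2} + z_{power}) / d)^2,
    where d is the standardised effect size (Cohen's d).
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_beta = NormalDist().inv_cdf(power)
    return math.ceil(2 * ((z_alpha + z_beta) / d) ** 2)


def randomised_run_order(specimens, seed=None):
    """Shuffle the order in which specimens are tested, so drift in the rig,
    lubricant batch, or ambient conditions cannot track the treatment groups.

    Pass a seed to make the randomisation reproducible and reportable.
    """
    order = list(specimens)
    random.Random(seed).shuffle(order)
    return order
```

For a large effect (d = 0.8) at 80% power, this gives roughly 25 specimens per group, far more than the three repeats the review uses as its threshold; recording the seed alongside the shuffled order makes the randomisation auditable.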
