Making complex prediction rules applicable for readers: Current practice in random forest literature and recommendations

Ideally, prediction rules should be published in such a way that readers can apply them, for example to make predictions for their own data. While this is straightforward for simple prediction rules, such as those based on the logistic regression model, it is much more difficult for complex prediction rules derived by machine learning tools. We conducted a survey of articles that reported prediction rules constructed using the random forest algorithm and were published in PLOS ONE in 2014-2015 in the field "medical and health sciences", with the aim of identifying issues related to their applicability. Making a prediction rule reproducible is one possible way to ensure that it is applicable; reproducibility is therefore also examined in our survey. The presented prediction rules were applicable in only 2 of the 30 identified papers, while for a further 8 prediction rules it was possible to obtain the necessary information by contacting the authors. Various problems, such as nonresponse of the authors, hampered the applicability of the prediction rules in the remaining cases. Based on our experience from this illustrative survey, we formulate a set of recommendations for authors who aim to make complex prediction rules applicable for readers. All data, including the description of the considered studies, and the analysis code are available as supplementary materials.
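To illustrate why a logistic regression rule is straightforward for readers to apply, the following minimal sketch computes a predicted probability from a published intercept and coefficients. The variable names and coefficient values are hypothetical and chosen only for illustration; they are not taken from any of the surveyed studies.

```python
import math

# Hypothetical published rule (illustrative values only, not from any study):
# an intercept plus one coefficient per predictor is all a reader needs.
INTERCEPT = -2.0
COEFFICIENTS = {"age_decades": 0.3, "smoker": 0.8}


def predict_probability(observation):
    """Apply the logistic regression rule to a single observation (a dict)."""
    linear_predictor = INTERCEPT + sum(
        COEFFICIENTS[name] * observation[name] for name in COEFFICIENTS
    )
    # Inverse logit maps the linear predictor to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-linear_predictor))


# A reader's own (hypothetical) observation.
new_patient = {"age_decades": 5.0, "smoker": 1}
risk = predict_probability(new_patient)
print(f"Predicted probability: {risk:.3f}")
```

A random forest has no comparably compact representation: applying it requires the fitted trees themselves, or the data and code needed to refit them, which is why its applicability depends on what the authors make available.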
