Using Machine Learning to Predict Laboratory Test Results.

OBJECTIVES: While clinical laboratories report most test results as individual numbers, findings, or observations, clinical diagnosis usually relies on the results of multiple tests. Clinical decision support that integrates multiple elements of laboratory data could be highly useful in enhancing laboratory diagnosis.

METHODS: Using the analyte ferritin as a proof of concept, we extracted clinical laboratory data from patient testing and applied a variety of machine-learning algorithms to predict ferritin test results from the results of other tests. We compared predicted with measured results and reviewed selected cases to assess the clinical value of predicted ferritin.

RESULTS: We show that patient demographics and results of other laboratory tests can discriminate normal from abnormal ferritin results with a high degree of accuracy (area under the curve as high as 0.97 on held-out test data). Case review indicated that predicted ferritin results may sometimes better reflect underlying iron status than measured ferritin.

CONCLUSIONS: These findings highlight the substantial informational redundancy present in patient test results and offer a potential foundation for a novel type of clinical decision support aimed at integrating, interpreting, and enhancing the diagnostic value of multianalyte sets of clinical laboratory test results.
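As an illustration of the accuracy measure quoted above, the area under the curve can be read as the probability that a randomly chosen abnormal case receives a higher predicted score than a randomly chosen normal case. The sketch below computes it that way in plain Python. It is not the authors' code: the function name `roc_auc`, the label encoding (1 = abnormal measured ferritin, 0 = normal), and the toy scores are all assumptions made for the example.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for an abnormal measured result, 0 for a normal one (assumed encoding).
    scores: model output, higher meaning "more likely abnormal".
    Ties between an abnormal and a normal case count as half a win.
    """
    abnormal = [s for y, s in zip(labels, scores) if y == 1]
    normal = [s for y, s in zip(labels, scores) if y == 0]
    if not abnormal or not normal:
        raise ValueError("AUC needs at least one case from each class")

    wins = 0.0
    for a in abnormal:
        for n in normal:
            if a > n:
                wins += 1.0
            elif a == n:
                wins += 0.5
    return wins / (len(abnormal) * len(normal))


# Hypothetical held-out cases, invented for illustration only.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.90, 0.80, 0.35, 0.40, 0.30, 0.20, 0.10]
toy_auc = roc_auc(toy_labels, toy_scores)  # 11 of 12 abnormal/normal pairs are ranked correctly
```

The pairwise form is quadratic in the number of cases, so it suits small examples; for real data sets a rank-based implementation from a statistics library would normally be used instead.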
