Using artificial intelligence to reduce diagnostic workload without compromising detection of urinary tract infections

Background
A substantial proportion of microbiological screening in diagnostic laboratories is due to suspected urinary tract infections (UTIs), yet approximately two-thirds of urine samples typically yield negative culture results. Reducing the number of query samples to be cultured, so that diagnostic services can concentrate on those with true microbial infections, could significantly improve the efficiency of the service.

Methodology
The screening process for urine samples prior to culture was modelled in a single clinical microbiology laboratory covering three hospitals and community services across Bristol and Bath, UK. Retrospective analysis of all urine microscopy, culture, and sensitivity reports over one year was used to compare two methods of classification: a heuristic model using a combination of white blood cell count and bacterial count, and a machine learning approach testing three algorithms (Random Forest, Neural Network, Extreme Gradient Boosting) whilst factoring in independent variables including demographics, historical urine culture results, and clinical details provided with the specimen.

Results
A total of 212,554 urine reports were analysed. Initial findings demonstrated the potential of machine learning algorithms, which outperformed the heuristic model in terms of the relative workload reduction achieved at a classification sensitivity > 95%. Upon further analysis of classification sensitivity in subpopulations, we concluded that samples from pregnant patients and children (age 11 or younger) require independent evaluation. We first investigated removing pregnant patients and children from the classification process, but this diminished the workload reduction achieved. The optimal solution was found to be three Extreme Gradient Boosting algorithms, trained independently for the classification of pregnant patients, children, and then all other patients. When combined, this system achieved a relative workload reduction of 41% and a sensitivity of 95% for each of the stratified patient groups.

Conclusion
Based on the considerable time and cost savings achieved, without compromising diagnostic performance, the heuristic model was successfully implemented in routine clinical practice in the diagnostic laboratory at Severn Pathology, Bristol. Our work shows the potential application of supervised machine learning models in improving service efficiency at a time when demand often surpasses the resources of public healthcare providers.