Developing a Machine Learning System for Identification of Severe Hand, Foot, and Mouth Disease from Electronic Medical Record Data

Children with severe hand, foot, and mouth disease (HFMD) often present with the same clinical features as those with mild HFMD during the early stage, yet later deteriorate rapidly with a fulminant disease course. Our goals were to (1) develop a machine learning system that automatically identifies cases at high risk of severe HFMD at the time of admission, and (2) compare the effectiveness of the new system with that of the existing risk scoring system. Data on 2,532 children with HFMD admitted between March 2012 and July 2015 were collected retrospectively from a medical center in China. Using a holdout strategy and 10-fold cross-validation, we developed four random forest models based on different variable sets. The prediction system HFMD-RF, based on the model with 16 variables drawn from both structured and unstructured data, achieved a sensitivity of 0.824, a specificity of 0.931, an accuracy of 0.916, and an area under the curve of 0.916 on the independent test set. Most notably, HFMD-RF offers significant gains over the pediatric critical illness score commonly used in clinical practice. Because all the selected risk factors can be easily obtained, HFMD-RF might prove useful for reducing mortality and complications of severe HFMD.
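
The evaluation workflow described above (a holdout split, 10-fold cross-validation on the training portion, a random forest classifier, and sensitivity, specificity, accuracy, and area under the curve on the held-out set) can be illustrated with a minimal sketch. This is not the authors' implementation. The data are synthetic, and the 70/30 split, the class ratio, the number of trees, the class weighting, and the 0.5 decision threshold are all assumptions chosen for illustration; only the cohort size (2,532) and the number of variables (16) are taken from the abstract.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix, roc_auc_score
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

# Synthetic stand-in for the cohort: 2,532 rows and 16 variables (from the abstract).
# The 85/15 class ratio is an assumption, not a figure from the study.
X, y = make_classification(
    n_samples=2532, n_features=16, n_informative=8,
    weights=[0.85, 0.15], random_state=0,
)

# Holdout strategy: keep an independent test set aside (70/30 is an assumption).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, stratify=y, random_state=0
)

# Random forest; hyperparameters here are illustrative assumptions.
model = RandomForestClassifier(
    n_estimators=100, class_weight="balanced", random_state=0
)

# 10-fold cross-validation on the training portion only.
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
cv_auc = cross_val_score(model, X_train, y_train, cv=cv, scoring="roc_auc")

# Refit on the full training portion, then score the independent test set.
model.fit(X_train, y_train)
prob = model.predict_proba(X_test)[:, 1]
pred = (prob >= 0.5).astype(int)  # 0.5 threshold is an assumption

tn, fp, fn, tp = confusion_matrix(y_test, pred).ravel()
metrics = {
    "sensitivity": tp / (tp + fn),  # true positive rate
    "specificity": tn / (tn + fp),  # true negative rate
    "accuracy": (tp + tn) / (tp + tn + fp + fn),
    "auc": roc_auc_score(y_test, prob),
}

print(f"mean 10-fold CV AUC: {cv_auc.mean():.3f}")
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because the data are synthetic, the printed numbers bear no relation to the results reported for HFMD-RF; the sketch only shows how the four reported metrics relate to a confusion matrix and to predicted probabilities.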
