Diagnosis of Methylmalonic Acidemia using Machine Learning Methods

Methylmalonic acidemia (MMA) is an autosomal recessive metabolic disorder. Traditional diagnosis depends heavily on the individual physician's professional medical knowledge and clinical experience. In this paper, we apply machine learning methods to diagnose MMA from patients' laboratory blood and urine tests, with the aim of enabling timely diagnosis and reducing that dependence. Comparing different machine learning algorithms for diagnosing MMA, we draw the following conclusions: (a) machine learning methods perform well for diagnosing MMA (all established predictive models achieve accuracy and AUC values greater than 0.85 on all data sets, and some of these values exceed 0.98); (b) the random forest algorithm performs best among the compared algorithms; and (c) in general, diagnosis based on combined urine and blood test data is better than diagnosis based on either test alone. These results indicate that machine learning models can credibly provide an initial diagnosis of MMA without requiring professional medical knowledge.
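The two evaluation measures named above, accuracy and AUC, can be illustrated with a minimal sketch. The labels and scores below are made-up toy values, not data from this study, and the function names `accuracy` and `auc` are hypothetical helpers introduced only for illustration. AUC is computed here in its pairwise form: the fraction of (positive, negative) pairs in which the positive case receives the higher score, with ties counted as one half.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_score: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of cases whose thresholded score matches the true label."""
    preds = [1 if s >= threshold else 0 for s in y_score]
    correct = sum(p == t for p, t in zip(preds, y_true))
    return correct / len(y_true)


def auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """Pairwise AUC: share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (assumed values): 1 = MMA, 0 = non-MMA.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
# Hypothetical predicted probabilities from two classifiers.
model_a = [0.9, 0.8, 0.7, 0.2, 0.1, 0.3, 0.4, 0.6]
model_b = [0.9, 0.4, 0.7, 0.2, 0.6, 0.3, 0.1, 0.8]

for name, scores in [("model_a", model_a), ("model_b", model_b)]:
    print(name, "accuracy:", accuracy(y_true, scores), "AUC:", auc(y_true, scores))
```

In this toy case the first set of scores separates the two classes perfectly, while the second misranks one pair and misclassifies two cases at the 0.5 threshold. This shows why the two measures are reported together: AUC does not depend on the choice of threshold, whereas accuracy does.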
