Supervised and unsupervised machine learning for improved identification of intrauterine growth restriction types

This paper addresses the automated identification of intrauterine growth restriction (IUGR) types using machine learning methods. It compares supervised and unsupervised learning, covering single and hybrid classification as well as clustering. Five single classification strategies were examined: k-nearest neighbours (kNN), C4.5, Naïve Bayes, random tree, and the sequential minimal optimization (SMO) algorithm for training support vector machines. The hybrid supervised techniques were bagging with Naïve Bayes, kNN, C4.5 and SMO as base classifiers; random forest, a variant of bagging with a decision tree as the base classifier; boosting with Naïve Bayes, SMO, kNN and C4.5 as base classifiers; and voting by all single classifiers with majority as the combination rule. Unsupervised learning comprised the k-means and expectation-maximization algorithms. The main conclusion of the study is that hybrid classifiers showed the potential to identify the symmetrical and asymmetrical types of IUGR more accurately, whereas the unsupervised learning techniques produced the worst results.
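
To make the hybrid schemes named above more concrete, the following is a minimal sketch of the majority combination rule and of bagging with kNN as the base classifier. It is an illustration under stated assumptions, not the implementation used in the study: the function names (majority_vote, knn_predict, bagged_knn_predict), the parameter defaults, and the two-feature toy rows are all hypothetical, and only the Python standard library is used.

```python
import math
import random
from collections import Counter


def majority_vote(labels):
    """Majority combination rule: return the most frequent label."""
    return Counter(labels).most_common(1)[0][0]


def knn_predict(train, query, k=3):
    """Classify `query` by a majority vote among its k nearest training rows."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], query))[:k]
    return majority_vote([label for _, label in nearest])


def bagged_knn_predict(train, query, n_models=11, k=3, seed=0):
    """Bagging: fit each kNN on a bootstrap replicate, then combine by vote."""
    rng = random.Random(seed)
    votes = []
    for _ in range(n_models):
        # Bootstrap replicate: draw len(train) rows with replacement.
        replicate = [rng.choice(train) for _ in train]
        votes.append(knn_predict(replicate, query, k))
    return majority_vote(votes)


# Synthetic, made-up two-feature rows for illustration; NOT data from the study.
toy_train = (
    [((1.0 + 0.1 * i, 1.0 + 0.05 * i), "symmetrical") for i in range(6)]
    + [((5.0 + 0.1 * i, 5.0 + 0.05 * i), "asymmetrical") for i in range(6)]
)

single = knn_predict(toy_train, (1.2, 1.1))          # one single classifier
bagged = bagged_knn_predict(toy_train, (4.9, 5.2))   # bagging with kNN
voted = majority_vote(["symmetrical", "asymmetrical", "symmetrical"])
```

The same majority_vote function serves both purposes here: it combines the neighbours inside one kNN classifier and the predictions of the bootstrap-trained ensemble members. Voting over heterogeneous single classifiers would apply it to one prediction per classifier, as in the last line.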
