Performance Analysis of Classification Algorithms on Birth Dataset

Using data mining and machine learning algorithms to derive insights from data and predict outcomes is a useful area of computing. These techniques are applied widely, including in industry, healthcare, organizations, and academia. Ongoing research has driven continuous improvement, particularly in healthcare. Researchers have applied machine learning to develop decision support systems, analyze dominant clinical factors, extract useful information from hidden patterns in historical data, make predictions, and classify diseases. Successful studies have created opportunities for physicians to take appropriate decisions at the right time. In the current study, we apply machine learning methods to the classification of birth data using bagging and boosting classification algorithms. Differences in lifestyle, medical assistance, religious practice, and region of residence collectively affect the residents of a society. This has encouraged researchers to conduct studies at the regional level to comprehensively explore the medical factors that contribute to complications among women during pregnancy. The current study is a comprehensive comparison of bagging and boosting classification algorithms on birth data collected from the government hospitals of the city of Muzaffarabad, Kashmir. The experiments are carried out using the caret package in R, a comprehensive framework for building machine learning models. Accuracy-based results are presented along with other evaluation measures. Bagging functions, including Adabag and BagFda, performed marginally better in terms of accuracy, precision, and recall. Improvements are observed in comparison with a previous study performed on the same dataset.
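The evaluation measures named above (accuracy, precision, and recall) can be illustrated with a minimal sketch for a two-class problem. This is plain Python, not the caret workflow used in the study. The function name `binary_metrics` and the label vectors are hypothetical and chosen only for illustration; they are not taken from the birth dataset.

```python
from typing import Dict, Sequence


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Compute accuracy, precision and recall for a binary classifier.

    `positive` marks which label counts as the positive class
    (an assumption of this sketch, e.g. 1 = complicated delivery).
    """
    pairs = list(zip(y_true, y_pred))
    # Confusion-matrix counts.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    total = tp + fp + fn + tn
    return {
        # Share of all cases classified correctly.
        "accuracy": (tp + tn) / total if total else 0.0,
        # Share of predicted positives that are truly positive.
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        # Share of true positives that the classifier found.
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }


if __name__ == "__main__":
    # Illustrative labels only; not data from the study.
    y_true = [1, 1, 1, 0, 0, 0, 1, 0]
    y_pred = [1, 0, 1, 0, 1, 0, 1, 0]
    print(binary_metrics(y_true, y_pred))
```

Reporting precision and recall alongside accuracy matters when the classes are imbalanced, because a classifier can reach high accuracy by favoring the majority class.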
