People at high risk of cardiovascular disease are likely to be vulnerable to chronic kidney disease (CKD), and historical medical records can help avert complicated kidney problems. In this paper, 12 supervised machine learning algorithms were used to analyse retrospective electronic medical data on CKD. The study targeted 544 outpatients, of whom 48 failed to meet the inclusion criteria and a further 21 had missing values; both groups were excluded from the study. Profiling and preliminary results established that 88.5% of the cases were labelled as advanced CKD, while 11.5% were labelled as early-stage CKD. The classification task and the subsequent evaluation of the models were based on the correct classification of these two groups. Of the evaluated algorithms, decision tree, boosted decision tree, and CN2 rule induction were the least accurate. In contrast, logistic regression (Ridge and Lasso), neural network (logistic and stochastic gradient descent), and support vector machine (Radial Basis Function and Polynomial) achieved very high accuracy and efficiency. With an efficiency of 93.4% and a classification accuracy of 91.7%, the Polynomial Support Vector Machine was the most efficient and accurate algorithm. The model suggested 253 two-dimensional combinations of factors, with a history of vascular disease and smoking as the most influential. The other combinations can provide information for predicting or detecting CKD from historical records. Future research should consider using a discretized Glomerular Filtration Rate so that the classification covers all five stages of CKD.
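As a rough illustration of the evaluation criterion described above (correct classification of the two groups), the following sketch computes classification accuracy alongside the accuracy of a trivial predictor that always outputs the more common label. The function names and the 1000 synthetic labels are assumptions made for illustration; they are not the study's data or code.

```python
from collections import Counter
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Fraction of cases whose predicted label matches the true label."""
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have equal length")
    if not y_true:
        raise ValueError("at least one case is required")
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def majority_baseline(y_true: Sequence[str]) -> float:
    """Accuracy obtained by always predicting the most common label."""
    if not y_true:
        raise ValueError("at least one case is required")
    counts = Counter(y_true)
    return counts.most_common(1)[0][1] / len(y_true)


# Hypothetical toy labels mirroring the reported 88.5% / 11.5% split
# (1000 synthetic cases, NOT the study data).
labels = ["advanced"] * 885 + ["early"] * 115
always_advanced = ["advanced"] * len(labels)

baseline = majority_baseline(labels)
print(f"majority-class baseline accuracy: {baseline:.3f}")
print(f"accuracy of always predicting 'advanced': "
      f"{accuracy(labels, always_advanced):.3f}")
```

With a split this uneven, a predictor that always answers "advanced" already matches the share of advanced cases, so a reported accuracy is most informative when read against that baseline.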