The Implementation of Z-Score Normalization and Boosting Techniques to Increase Accuracy of C4.5 Algorithm in Diagnosing Chronic Kidney Disease

In the health sector, data mining can support disease prediction from collections of patient medical records or other health data. One applicable technique is classification with the C4.5 algorithm. Its accuracy can be increased by transforming the data with the z-score normalization method during pre-processing. In addition, applying an ensemble method, namely boosting (AdaBoost), can further improve the accuracy of the C4.5 algorithm. The purpose of this study was to determine how z-score normalization in the pre-processing stage and AdaBoost are implemented with the C4.5 algorithm, and to determine the accuracy of the C4.5 algorithm in diagnosing chronic kidney disease after applying z-score normalization and AdaBoost. In this study, the mining process used k-fold cross validation with the default value k = 10. The C4.5 algorithm alone obtained an accuracy of 96%, while the C4.5 algorithm with z-score normalization obtained an accuracy of 96.75%. The highest accuracy, 97.25%, was obtained by adding the boosting method to the C4.5 algorithm with z-score normalization. This is an increase of 1.25 percentage points over the accuracy of the C4.5 algorithm alone.
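
As an illustration of the z-score transformation named above, the following minimal sketch rescales one numeric attribute so that it has mean 0 and standard deviation 1, using z = (x - mean) / standard deviation. The function name `zscore_normalize`, the use of the population standard deviation, and the sample values are assumptions made for illustration only; they are not taken from the study or its dataset.

```python
from statistics import mean, pstdev


def zscore_normalize(values):
    """Return the z-score (x - mean) / std of each value.

    Assumption: the population standard deviation is used; the study
    does not state which variant it applied.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        # A constant attribute carries no spread; map every value to 0.
        return [0.0 for _ in values]
    return [(x - mu) / sigma for x in values]


# Hypothetical attribute values, not drawn from the study's data.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
normalized = zscore_normalize(sample)
print(normalized)
```

In a cross-validated setting such as the 10-fold procedure described above, the mean and standard deviation would typically be estimated on the training folds only and then applied to the held-out fold, so that no information from the test data influences the transformation.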