Predicting Chronic Kidney Failure Disease Using Data Mining Techniques

Kidney failure is a serious challenge for the medical field, affecting a large share of the world's population. Because kidney disease often produces no early symptoms, it is frequently identified too late, when dialysis is already urgently needed. Data mining techniques can help address this situation by discovering hidden patterns and relationships in medical data. The objective of this research is to predict kidney disease using five machine learning algorithms: Support Vector Machine (SVM), Multilayer Perceptron (MLP), Decision Tree (C4.5), Bayesian Network (BN) and K-Nearest Neighbour (K-NN). The work compares these algorithms and identifies the most efficient one(s) on the basis of multiple criteria. The experiments use the “Chronic Kidney Disease” dataset and are run on the WEKA platform. The experimental results show that MLP and C4.5 achieve the best rates. However, when the algorithms are compared using the Receiver Operating Characteristic (ROC) curve, C4.5 appears to be the most efficient.
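To illustrate why a ROC-based comparison can separate classifiers whose rates look alike, the following is a minimal sketch in plain Python. It is not the WEKA workflow used in this work. The function names, the 0.5 threshold, and the toy scores are assumptions made up for the example, and the sketch assumes the "rates" in question behave like classification accuracy.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of cases classified correctly when a score >= threshold counts as positive."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve, computed with the pairwise (Mann-Whitney) formulation.

    This equals the probability that a randomly chosen positive case gets a
    higher score than a randomly chosen negative case; ties count as half.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy data invented for illustration only; it is NOT taken from the study.
# 1 = disease present, 0 = disease absent.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
model_a = [0.9, 0.8, 0.7, 0.4, 0.3, 0.2, 0.1, 0.6]    # hypothetical classifier A scores
model_b = [0.9, 0.8, 0.7, 0.1, 0.45, 0.4, 0.3, 0.6]   # hypothetical classifier B scores

for name, scores in (("A", model_a), ("B", model_b)):
    print(name, accuracy(labels, scores), roc_auc(labels, scores))
```

On this toy data both hypothetical classifiers reach the same accuracy (0.75), yet their AUC values differ (0.9375 versus 0.75), because AUC reflects how well the scores rank positive cases above negative ones across all thresholds rather than at a single cut-off.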
