A novel method for predicting kidney stone type using ensemble learning

Kidney stone disease is associated with high morbidity and is a major concern for healthcare systems worldwide. Data mining techniques such as classification can support early prediction of the disease and help reduce its incidence and associated costs. The objective of the present study is to derive a model for the early detection of kidney stone type, and to identify the most influential parameters, with the aim of providing a decision-support system. Data were collected from 936 patients with nephrolithiasis at the kidney center of Razi Hospital in Rasht from 2012 through 2016. The prepared dataset included 42 features. Data pre-processing was the first step toward extracting the relevant features. The data were analyzed with Weka, using several classification algorithms: Bayesian models, different types of decision trees, artificial neural networks, and rule-based classifiers. We also proposed four ensemble-learning models to improve on the accuracy of the individual learning algorithms. In addition, we proposed a novel technique for combining the individual classifiers in an ensemble, in which each classifier is assigned a weight by our genetic-algorithm-based method. The resulting models were evaluated with 10-fold cross-validation using standard measures. Assessing the contribution of each feature to the predictive model was a further challenge, so the predictive strength of each feature for producing a reproducible outcome was also investigated. Across the applied models, sex, uric acid condition, calcium level, hypertension, diabetes, nausea and vomiting, flank pain, and urinary tract infection (UTI) were the most important parameters for predicting the chance of nephrolithiasis.
The final ensemble-based model achieved an accuracy of 97.1% and proved robust, and it could be applied in future studies to predict the chance of developing nephrolithiasis. The model offers a new way to study stone disease by capturing the complex interactions among different biological variables, thereby supporting earlier identification and a shorter diagnosis time.
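
The abstract does not describe how the genetic algorithm encodes or evolves the classifier weights, so the following is only a minimal sketch of the general idea (weighted voting whose weights are searched by a genetic algorithm), not the authors' method. Everything in it is an assumption made for illustration: the function names, the use of training accuracy as fitness, truncation selection, uniform crossover, Gaussian mutation, and the use of Python instead of Weka.

```python
import random


def weighted_vote(predictions, weights, classes):
    """Return the class with the largest summed weight.

    predictions: one predicted label per base classifier.
    Ties are broken in favour of the class listed first in `classes`.
    """
    scores = {c: 0.0 for c in classes}
    for label, w in zip(predictions, weights):
        scores[label] += w
    return max(classes, key=lambda c: scores[c])


def ensemble_accuracy(pred_matrix, y_true, weights, classes):
    """Accuracy of the weighted vote.

    pred_matrix[i][j] is the label that classifier j predicts for sample i.
    """
    correct = sum(
        weighted_vote(row, weights, classes) == y
        for row, y in zip(pred_matrix, y_true)
    )
    return correct / len(y_true)


def evolve_weights(pred_matrix, y_true, classes, pop_size=20,
                   generations=30, mutation_rate=0.2, seed=0):
    """Search for per-classifier weights in [0, 1] with a simple GA.

    Assumed design (not taken from the paper): fitness is ensemble accuracy
    on the supplied predictions, the better half of the population survives
    unchanged (elitism), and children come from uniform crossover followed
    by clamped Gaussian mutation.
    """
    rng = random.Random(seed)
    n = len(pred_matrix[0])

    def fitness(w):
        return ensemble_accuracy(pred_matrix, y_true, w, classes)

    # Include the uniform weight vector so that, thanks to elitism, the
    # result is never worse than plain majority voting on this data.
    population = [[1.0] * n]
    population += [[rng.random() for _ in range(n)]
                   for _ in range(pop_size - 1)]

    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        parents = population[: pop_size // 2]
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            # Uniform crossover: each gene comes from one of the two parents.
            child = [rng.choice(pair) for pair in zip(a, b)]
            # Gaussian mutation, clamped to the [0, 1] range.
            child = [
                min(1.0, max(0.0, g + rng.gauss(0.0, 0.1)))
                if rng.random() < mutation_rate else g
                for g in child
            ]
            children.append(child)
        population = parents + children

    return max(population, key=fitness)
```

In use, `pred_matrix` would hold the labels that each trained base classifier outputs on a held-out fold, and `evolve_weights(pred_matrix, y_true, classes)` would return one weight per classifier for `weighted_vote`. Fitting the weights on the same data used to report accuracy would overstate performance, so in a cross-validated setting the weights would be tuned inside each training fold.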
