Patient classification and outcome prediction in IgA nephropathy

OBJECTIVE IgA nephropathy (IgAN) is a common kidney disease that may lead to renal failure, known as end-stage kidney disease (ESKD). A major difficulty in managing this disease is predicting a patient's long-term prognosis at the time of diagnosis, because the progression of IgAN to ESKD depends on an intricate interrelationship between clinical and laboratory findings. The objective of this work was therefore to select the best data mining tool for building a model able to predict (I) whether a patient with biopsy-proven IgAN will reach ESKD and (II) whether that patient will reach ESKD before or after 5 years.
MATERIAL AND METHODS The largest available cohort study worldwide on IgAN was used to design and compare several data-driven models. The complete dataset comprised 1174 records collected from Italian, Norwegian, and Japanese IgAN patients over the last 30 years. The data mining tools considered were artificial neural networks (ANNs), neuro-fuzzy systems (NFSs), support vector machines (SVMs), and decision trees (DTs). Ten-fold cross-validation was used to obtain unbiased performance estimates for all models.
RESULTS An extensive model comparison based on accuracy, precision, recall, and f-measure is provided. Overall, the results indicate that ANNs can outperform the other models. The ANN for time-to-ESKD prediction achieves accuracy, precision, recall, and f-measure greater than 90%. The ANN for ESKD prediction achieves accuracy greater than 90%, along with precision, recall, and f-measure greater than 90% for the class of patients not reaching ESKD; precision, recall, and f-measure for the class of patients reaching ESKD are slightly lower. The resulting model has been implemented in a Web-based decision support system (DSS).
CONCLUSIONS The extraction of novel knowledge from clinical data and the definition of predictive models to support diagnosis, prognosis, and therapy are becoming essential tools for researchers and clinical practitioners in medicine. The proposed comparative study of several data mining models for outcome prediction in IgAN patients, using a large dataset of clinical records from three different countries, provides insight into the relative predictive ability of the considered methods when applied to this disease.
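
The evaluation protocol named in the abstract (10-fold cross-validation scored by accuracy, precision, recall, and f-measure) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the fixed shuffling seed, and the midpoint-threshold classifier are hypothetical placeholders standing in for the ANN, NFS, SVM, and DT models, and the abstract does not say whether the folds were shuffled or stratified. The `positive` argument selects the class for which precision, recall, and f-measure are computed, mirroring the per-class figures reported in RESULTS.

```python
import random
from typing import Callable, Dict, List, Sequence


def k_fold_indices(n: int, k: int = 10, seed: int = 0) -> List[List[int]]:
    """Shuffle range(n) and split it into k folds whose sizes differ by at most one."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)  # fixed seed: an assumption, for repeatability
    base, extra = divmod(n, k)
    folds, start = [], 0
    for i in range(k):
        size = base + (1 if i < extra else 0)
        folds.append(indices[start:start + size])
        start += size
    return folds


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int],
                   positive: int = 1) -> Dict[str, float]:
    """Accuracy plus precision, recall, and f-measure for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_measure": f_measure}


def cross_val_predict(X: Sequence[float], y: Sequence[int],
                      fit: Callable[[List[float], List[int]], Callable[[float], int]],
                      k: int = 10) -> List[int]:
    """Predict every record with a model trained only on the other k-1 folds."""
    predictions = [0] * len(y)
    for test_idx in k_fold_indices(len(y), k):
        held_out = set(test_idx)
        train_X = [x for i, x in enumerate(X) if i not in held_out]
        train_y = [t for i, t in enumerate(y) if i not in held_out]
        model = fit(train_X, train_y)
        for i in test_idx:
            predictions[i] = model(X[i])
    return predictions


def fit_midpoint_threshold(train_X: List[float],
                           train_y: List[int]) -> Callable[[float], int]:
    """Placeholder classifier: threshold halfway between the two class means.

    Assumes one numeric feature, labels 0/1, both classes present in the
    training data, and class 1 having the larger mean.
    """
    class0 = [x for x, t in zip(train_X, train_y) if t == 0]
    class1 = [x for x, t in zip(train_X, train_y) if t == 1]
    threshold = (sum(class0) / len(class0) + sum(class1) / len(class1)) / 2
    return lambda x: 1 if x >= threshold else 0


if __name__ == "__main__":
    # Synthetic, well-separated toy data; not drawn from the IgAN cohort.
    X = [float(v) for v in list(range(10)) + list(range(100, 110))]
    y = [0] * 10 + [1] * 10
    pooled = cross_val_predict(X, y, fit_midpoint_threshold, k=10)
    print(binary_metrics(y, pooled, positive=1))
    print(binary_metrics(y, pooled, positive=0))
```

Because every record is predicted by a model that never saw it during training, metrics computed on the pooled predictions are not inflated by training-set fit. Swapping `fit_midpoint_threshold` for another `fit` function with the same signature is how several model families could be compared under one identical protocol.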
