A Random Forest based predictor for medical data classification using feature ranking

Abstract: Medical data classification is a challenging task in medical informatics. Although many approaches have been reported in the literature, there is still scope for improvement. In this paper, we develop and implement a feature-ranking-based approach to medical data classification. The features of a dataset are first ranked using suitable ranking algorithms, and the Random Forest classifier is then applied only to the highly ranked features to construct the predictor. We conducted extensive experiments on 10 benchmark datasets, and the results are promising. We present highly accurate predictors for 10 different diseases and suggest a methodology that is sufficiently general and is expected to perform well for other diseases with similar datasets.
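The rank-then-classify idea described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the scikit-learn library, mutual information as the ranker, the cutoff `k=10`, the forest size, and the Wisconsin breast cancer dataset bundled with scikit-learn. The abstract does not name the rankers, the cutoffs, or the exact datasets used.

```python
from functools import partial

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, mutual_info_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline


def build_ranked_rf(k, seed=0):
    """Rank features, keep the k best, then fit a Random Forest on them.

    Assumption: mutual information stands in for the paper's unnamed rankers.
    """
    # Score every feature against the class label and keep the top k.
    ranker = SelectKBest(
        score_func=partial(mutual_info_classif, random_state=seed), k=k
    )
    # The forest only ever sees the k highly ranked features.
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    return Pipeline([("rank", ranker), ("forest", forest)])


# Illustrative dataset (assumption): 569 samples, 30 numeric features.
X, y = load_breast_cancer(return_X_y=True)

model = build_ranked_rf(k=10)

# 5-fold cross-validation; the ranking is refit inside each training fold.
scores = cross_val_score(model, X, y, cv=5)
mean_accuracy = scores.mean()

# Fit once on all data to inspect which feature indices were kept.
model.fit(X, y)
selected = model.named_steps["rank"].get_support(indices=True)

print(f"mean CV accuracy: {mean_accuracy:.3f}")
print(f"selected feature indices: {selected.tolist()}")
```

Placing the ranker inside the pipeline means the features are re-ranked on each training fold, so the held-out fold does not influence which features are selected. Whether the paper's experiments follow this protocol is not stated in the abstract.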
