A fuzzy-rough nearest neighbor classifier combined with consistency-based subset evaluation and instance selection for automated diagnosis of breast cancer

A novel classification model based on the fuzzy-rough nearest neighbor method. Fuzzy-rough instance selection. Consistency-based subset evaluation combined with a re-ranking algorithm. Automated diagnosis of breast cancer with a classification accuracy of 99.71%.

Breast cancer is one of the most common and deadliest cancers among women. Early diagnosis and treatment can improve patient outcomes, so developing highly accurate classification models is an essential task in medical informatics. Machine learning algorithms have been widely employed to build robust and efficient classification models. In this paper, we present a hybrid intelligent classification model for breast cancer diagnosis. The proposed model consists of three phases: instance selection, feature selection, and classification. In instance selection, the fuzzy-rough instance selection method based on the weak gamma evaluator is used to remove useless or erroneous instances. In feature selection, the consistency-based feature selection method is used in conjunction with a re-ranking algorithm, owing to its efficiency in exploring the possible feature subsets in the search space. In the classification phase, the fuzzy-rough nearest neighbor algorithm is used; this classifier is chosen because it does not require an optimal value for the number of neighbors K and provides richer class confidence values. To test the efficacy of the proposed model, we used the Wisconsin Breast Cancer Dataset (WBCD). Performance is evaluated using classification accuracy, sensitivity, specificity, F-measure, area under the curve (AUC), and the Kappa statistic. The obtained classification accuracy of 99.7151% is a very promising result compared to existing works that report results on the same dataset.
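As a minimal sketch of how most of the evaluation measures named in the abstract relate to one another, the following Python function derives accuracy, sensitivity, specificity, F-measure, and Cohen's Kappa from the four cells of a binary confusion matrix. The function name `binary_metrics` and the example counts are hypothetical and are not taken from the paper; AUC is omitted because it requires ranked classifier scores rather than counts. The sketch assumes that "malignant" is treated as the positive class and that no denominator is zero.

```python
def binary_metrics(tp, fn, fp, tn):
    """Compute common binary classification measures from confusion-matrix counts.

    tp, fn, fp, tn: true positives, false negatives, false positives,
    true negatives (hypothetical inputs, not results from the paper).
    Assumes every denominator below is non-zero.
    """
    total = tp + fn + fp + tn

    accuracy = (tp + tn) / total
    sensitivity = tp / (tp + fn)   # true positive rate (recall)
    specificity = tn / (tn + fp)   # true negative rate
    precision = tp / (tp + fp)

    # F-measure: harmonic mean of precision and sensitivity
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)

    # Cohen's Kappa: agreement beyond what chance alone would produce
    observed = accuracy
    expected = ((tp + fn) * (tp + fp) + (tn + fp) * (tn + fn)) / total ** 2
    kappa = (observed - expected) / (1 - expected)

    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_measure": f_measure,
        "kappa": kappa,
    }


if __name__ == "__main__":
    # Illustrative counts only; they do not reproduce the reported 99.7151%.
    for name, value in binary_metrics(tp=40, fn=10, fp=5, tn=45).items():
        print(f"{name}: {value:.4f}")
```

Reporting sensitivity and specificity alongside accuracy matters in a diagnostic setting because a missed malignant case and a false alarm carry very different costs, and Kappa indicates how much of the agreement exceeds what chance would yield.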
