Hybrid Multistage Fuzzy Clustering System for Medical Data Classification

Rapid technological development has made massive amounts of data available. In medicine, decision making depends heavily on the information hidden in these data, and data mining and machine learning provide powerful tools for discovering it. Two main families of techniques are commonly used: clustering, which is an unsupervised learning technique, and classification, which is a supervised learning method. Both can extract useful patterns that aid data analysis and clinical decisions. This paper surveys recent work applying these techniques in the medical field over the past five years. It also proposes a hybrid multistage fuzzy clustering system for medical data classification. In the first stage of the proposed system, two fuzzy clustering algorithms, fuzzy c-means (FCM) and Gustafson-Kessel (GK), are used to obtain the membership values of each sample. In the second stage, these membership values are added to the original features as additional informative features, and a support vector machine (SVM) performs the classification. The Wisconsin Breast Cancer dataset, a real-world dataset obtained from the UCI Machine Learning Repository, was used in the experiments. The results show that the additional membership features further improve classification, yielding an accuracy of 99.06% and a sensitivity of 100%.
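The first stage described above can be illustrated with a minimal sketch, which is not the authors' implementation. It runs a standard fuzzy c-means update on a small made-up dataset and appends the resulting membership values to each sample as extra features. The function names (`fcm_memberships`, `augment`), the fuzzifier `m = 2`, the stopping rule, and the toy points are all assumptions chosen for illustration. The GK variant, which replaces the Euclidean distance with a cluster-specific adaptive norm, and the SVM stage are not shown.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fcm_memberships(points, n_clusters=2, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy c-means (hypothetical helper, illustrative defaults).

    Returns (centers, memberships); memberships[i][k] is the degree to
    which sample i belongs to cluster k, and each row sums to 1.
    """
    rng = random.Random(seed)
    n, dim = len(points), len(points[0])

    # Random initial memberships, each row normalised to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    centers = []
    for _ in range(n_iter):
        # Centers: means of the samples weighted by membership^m.
        centers = []
        for k in range(n_clusters):
            w = [u[i][k] ** m for i in range(n)]
            total = sum(w)
            centers.append(
                [sum(w[i] * points[i][d] for i in range(n)) / total
                 for d in range(dim)]
            )

        # Memberships: inverse-distance update with exponent 2 / (m - 1).
        new_u = []
        for i in range(n):
            dists = [max(euclidean(points[i], c), 1e-12) for c in centers]
            row = []
            for k in range(n_clusters):
                denom = sum((dists[k] / dists[j]) ** (2.0 / (m - 1.0))
                            for j in range(n_clusters))
                row.append(1.0 / denom)
            new_u.append(row)

        shift = max(abs(new_u[i][k] - u[i][k])
                    for i in range(n) for k in range(n_clusters))
        u = new_u
        if shift < tol:
            break
    return centers, u


def augment(points, memberships):
    """Append each sample's membership values to its original features."""
    return [list(p) + list(row) for p, row in zip(points, memberships)]


# Made-up toy data: two well-separated groups of 2-D samples.
toy = [(1.0, 1.1), (0.9, 1.0), (1.2, 0.8),
       (5.0, 5.1), (5.2, 4.9), (4.8, 5.0)]
centers, U = fcm_memberships(toy)
augmented = augment(toy, U)  # each row: 2 original features + 2 memberships
```

In a full pipeline, rows such as those in `augmented` would be passed, together with the class labels, to an SVM implementation of the reader's choice for the second stage.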
