Evidence Accumulation Clustering with Possibilistic Fuzzy C-Means base clustering approach to disease diagnosis

Traditionally, supervised machine learning methods are the first choice for data classification tasks. This study presents an unconventional hybrid alternative, pEAC, which uses the Possibilistic Fuzzy C-Means (PFCM) algorithm as the base cluster generator within the ‘standard’ Evidence Accumulation Clustering (EAC) method. PFCM combines the separate properties of the Possibilistic C-Means (PCM) and Fuzzy C-Means (FCM) algorithms in a single clustering algorithm. Structurally, the hybrid resembles the hEAC and fEAC ensemble clustering techniques, which are obtained by integrating the K-Means and FCM clustering algorithms, respectively, into EAC. To validate the new technique's effectiveness, we evaluated its performance on both synthetic and real medical datasets against individual runs of well-known clustering methods, other unsupervised ensemble clustering techniques, and some supervised machine learning methods. Our results show that pEAC outperformed the individual runs of the clustering methods and the other unsupervised ensemble techniques in terms of accuracy on the hepatitis, cardiovascular, breast cancer, and diabetes diagnosis datasets used in the experiments. Notably, pEAC also achieved better diagnostic accuracy than the selected supervised classification models on the two breast cancer datasets, which suggests that the proposed technique can classify medical data effectively even without labelled data.
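As a rough illustration of the evidence accumulation idea, the sketch below combines several base partitions into a co-association matrix (the fraction of partitions in which two samples share a cluster) and then extracts a consensus partition. It makes several assumptions: the base partitions are hard-coded toy labels rather than the output of PFCM runs, the function names `co_association` and `consensus_labels` are hypothetical, and the consensus step is simplified to connected components above a fixed threshold (equivalent to cutting a single-link dendrogram at that similarity) rather than whatever consensus procedure the study itself uses.

```python
def co_association(partitions):
    """Fraction of base partitions in which each pair of samples shares a cluster."""
    n = len(partitions[0])
    counts = [[0] * n for _ in range(n)]
    for labels in partitions:
        for i in range(n):
            for j in range(n):
                if labels[i] == labels[j]:
                    counts[i][j] += 1
    m = len(partitions)
    return [[c / m for c in row] for row in counts]


def consensus_labels(coassoc, threshold=0.5):
    """Group samples linked by co-association above `threshold`.

    Simplifying assumption: connected components of the thresholded
    graph, i.e. a single-link cut at a fixed similarity level.
    """
    n = len(coassoc)
    labels = [-1] * n
    current = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(n):
                if labels[j] == -1 and coassoc[i][j] > threshold:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


# Toy base partitions of six samples (hypothetical; in pEAC these would
# come from repeated PFCM runs). Label names differ between runs, which
# is fine because only co-membership matters.
partitions = [
    [0, 0, 0, 1, 1, 1],
    [1, 1, 1, 0, 0, 0],
    [0, 0, 1, 1, 2, 2],
]

coassoc = co_association(partitions)
labels = consensus_labels(coassoc)
print(labels)
```

Because the matrix records only whether two samples were grouped together, base partitions with different label names or different numbers of clusters can be combined directly.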
