Evolving Fuzzy Clustering Approach: An Epoch Clustering That Enables Heuristic Postpruning

Clustering is an unsupervised machine learning method that is used both on its own and as part of the pre-processing stage for supervised machine learning methods. Because it has no labels to learn from, clustering is generally less accurate than supervised learning. This study introduces a new perspective on clustering by defining an approach for data pruning, the Evolving Fuzzy Clustering Approach (EFCA). The method also enables clustering with multiple sets of prototypes instead of a single set, to improve clustering accuracy. Consequently, the approach can be used independently or as a pre-processing step that prepares purified data for the training stage of a supervised learning method. EFCA uses the fuzzy membership concept to break clustering down into epochs instead of clustering all data at once. In some supervised learning settings, a smaller subset of accurately labeled data is preferable to a larger dataset with less accurate labels. EFCA's "epoch cut" provides a post-pruning ability that eliminates obscure data points, which results in higher clustering accuracy. EFCA has been applied to eight multivariate and ten time-series datasets. For example, on the Iris data, after applying the epoch cut and eliminating obscure data (20% of the data) by automatic post-pruning, it achieved 100 percent accuracy on the remaining 80%.
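The idea of pruning obscure points by fuzzy membership can be illustrated with a minimal sketch. This is not the authors' EFCA procedure, whose epoch mechanics are not specified in the abstract. It assumes fixed prototypes, the standard fuzzy c-means membership formula, and a simple rule that keeps the fraction of points with the highest maximum membership. The function names `fuzzy_memberships` and `prune_obscure` and the parameter `keep_fraction` are hypothetical.

```python
import numpy as np


def fuzzy_memberships(X, prototypes, m=2.0):
    """Standard fuzzy c-means memberships for fixed prototypes.

    X has shape (n_points, n_features); prototypes has shape
    (n_clusters, n_features); m > 1 is the fuzzifier.
    Returns U with shape (n_points, n_clusters); each row sums to 1.
    """
    # Euclidean distance from every point to every prototype.
    d = np.linalg.norm(X[:, None, :] - prototypes[None, :, :], axis=2)
    # Guard against division by zero when a point sits on a prototype.
    d = np.maximum(d, 1e-12)
    inv = d ** (-2.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)


def prune_obscure(U, keep_fraction=0.8):
    """Assumed pruning rule: keep the points whose largest membership
    is highest and drop the rest as obscure.

    Returns the kept indices (ascending) and their hard labels.
    """
    confidence = U.max(axis=1)
    n_keep = int(round(keep_fraction * len(confidence)))
    order = np.argsort(-confidence, kind="stable")
    keep = np.sort(order[:n_keep])
    return keep, U[keep].argmax(axis=1)


# Toy data: two tight groups plus one point halfway between them.
X = np.array([[0.0, 0.0], [0.1, 0.0], [1.0, 1.0], [0.9, 1.0], [0.5, 0.5]])
prototypes = np.array([[0.0, 0.0], [1.0, 1.0]])

U = fuzzy_memberships(X, prototypes)
keep, labels = prune_obscure(U, keep_fraction=0.8)
print(keep, labels)
```

In this toy case the halfway point has equal membership in both clusters, so it is the one dropped when 80% of the data is kept. The cut-off rule and the choice of `keep_fraction` are illustrative only.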
