A hybrid metaheuristic and kernel intuitionistic fuzzy c-means algorithm for cluster analysis

Abstract: Cluster analysis is a widely used data mining approach. Although many clustering algorithms have been proposed, no single method suits all types of datasets. This study proposes an evolutionary clustering algorithm that combines a metaheuristic with the kernel intuitionistic fuzzy c-means (KIFCM) algorithm. KIFCM improves on the fuzzy c-means (FCM) algorithm by employing an intuitionistic fuzzy set and a kernel function. Previous studies have found KIFCM promising, but it remains highly sensitive to the initial centroids. This study addresses that weakness by using a metaheuristic to supply better initial centroids for KIFCM. Three metaheuristics are applied: particle swarm optimization (PSO), the genetic algorithm (GA) and the artificial bee colony (ABC) algorithm. Although hybrid methods of this kind are not new, this is the first paper to combine metaheuristics with KIFCM. The proposed algorithms, PSO-KIFCM, GA-KIFCM and ABC-KIFCM, are evaluated on six benchmark datasets and compared with K-means, FCM, kernel fuzzy c-means (KFCM) and KIFCM. The results show that the proposed algorithms achieve better accuracy. The proposed algorithms are also applied to a customer segmentation case study using data from franchise stores selling women's clothing in Taiwan, where they again produce better cluster structures than the other tested algorithms.
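The two-stage idea described in the abstract (a metaheuristic searches for good initial centroids, then KIFCM refines them) can be illustrated with a minimal sketch. This is not the paper's implementation: the abstract gives no equations, so the sketch assumes a Gaussian kernel, the usual KFCM membership and centroid updates, a Yager-type hesitation degree for the intuitionistic step, and a bare-bones PSO with fixed coefficients. All function names and parameter values (`sigma`, `alpha`, inertia 0.7, acceleration 1.5) are hypothetical choices made for illustration.

```python
import numpy as np

def gaussian_kernel(X, V, sigma):
    # K[i, k] = exp(-||x_k - v_i||^2 / sigma^2), shape (c, n)
    d2 = ((V[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-d2 / sigma ** 2)

def memberships(X, V, m, sigma, alpha):
    K = gaussian_kernel(X, V, sigma)
    dist = np.maximum(1.0 - K, 1e-12)            # kernel-induced distance
    u = dist ** (-1.0 / (m - 1.0))
    u = np.clip(u / u.sum(axis=0, keepdims=True), 0.0, 1.0)
    # Assumed Yager-type hesitation degree; intuitionistic membership is u + pi
    pi = 1.0 - u - (1.0 - u ** alpha) ** (1.0 / alpha)
    return u + pi, K

def objective(X, V, m, sigma, alpha):
    u, K = memberships(X, V, m, sigma, alpha)
    return 2.0 * ((u ** m) * (1.0 - K)).sum()

def pso_init(X, c, m, sigma, alpha, rng, n_particles=10, n_iter=30):
    # Each particle encodes one candidate set of c centroids, shape (c, d)
    pos = X[rng.integers(0, X.shape[0], size=(n_particles, c))]
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_f = np.array([objective(X, p, m, sigma, alpha) for p in pos])
    gbest = pbest[pbest_f.argmin()].copy()
    for _ in range(n_iter):
        r1, r2 = rng.random(pos.shape), rng.random(pos.shape)
        vel = 0.7 * vel + 1.5 * r1 * (pbest - pos) + 1.5 * r2 * (gbest - pos)
        pos = pos + vel
        f = np.array([objective(X, p, m, sigma, alpha) for p in pos])
        better = f < pbest_f
        pbest[better], pbest_f[better] = pos[better], f[better]
        gbest = pbest[pbest_f.argmin()].copy()
    return gbest

def kifcm(X, V, m=2.0, sigma=1.0, alpha=0.85, n_iter=100, tol=1e-6):
    # Alternate membership and centroid updates until the centroids settle
    for _ in range(n_iter):
        u, K = memberships(X, V, m, sigma, alpha)
        w = (u ** m) * K
        V_new = (w @ X) / w.sum(axis=1, keepdims=True)
        shift = np.abs(V_new - V).max()
        V = V_new
        if shift < tol:
            break
    u, _ = memberships(X, V, m, sigma, alpha)
    return V, u.argmax(axis=0)
```

A call such as `V0 = pso_init(X, c, 2.0, sigma, 0.85, rng)` followed by `kifcm(X, V0, sigma=sigma)` mirrors the PSO-KIFCM pipeline in spirit; swapping `pso_init` for a GA or ABC search over the same objective would give the other two variants. The kernel width `sigma` must be chosen relative to the data scale, since a very small value drives all kernel values toward zero.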
