A Novel Consensus Fuzzy K-Modes Clustering Using Coupling DNA-Chain-Hypergraph P System for Categorical Data

This paper proposes a clustering method, consensus fuzzy k-modes clustering, to improve clustering performance on categorical data. A coupling DNA-chain-hypergraph P system is constructed to carry out the clustering process. This P system helps prevent the algorithm from falling into local optima and realizes the clustering process with implicit parallelism. The consensus fuzzy k-modes algorithm combines the advantages of the fuzzy k-modes algorithm, the weighted fuzzy k-modes algorithm, and the genetic fuzzy k-modes algorithm. The fuzzy k-modes algorithm produces a soft partition, which is closer to reality, but it treats all variables equally. The weighted fuzzy k-modes algorithm introduces a weight vector that strengthens basic k-modes clustering by assigning higher weights to the features that are useful in the analysis. Both of these methods only improve the k-modes algorithm itself. The genetic fuzzy k-modes algorithm, by contrast, applies genetic operations within the clustering process. In this paper, we examine these three kinds of k-modes algorithms and further introduce DNA genetic optimization operations in the final consensus process. Finally, we conduct experiments on seven UCI datasets and compare the clustering results with those of four other categorical clustering algorithms. The experimental results and statistical tests show that our method obtains better clustering results than the compared algorithms.
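To make the soft partition mentioned above concrete, the following is a minimal sketch of one iteration of the standard fuzzy k-modes base algorithm. It is not the consensus method or the P system proposed in this paper. It assumes the simple matching dissimilarity (number of mismatched attributes) and a fuzziness exponent `alpha` > 1. The function names, the toy data, and the value `alpha=1.5` are illustrative assumptions, not taken from the paper.

```python
from collections import defaultdict


def mismatch(x, z):
    """Simple matching dissimilarity: count of attributes where x and z differ."""
    return sum(a != b for a, b in zip(x, z))


def memberships(data, modes, alpha=1.5):
    """Fuzzy membership of every object in every cluster (each row sums to 1)."""
    exponent = 1.0 / (alpha - 1.0)
    U = []
    for x in data:
        d = [mismatch(x, z) for z in modes]
        if 0 in d:
            # The object coincides with a mode: full membership in the first such cluster.
            hit = d.index(0)
            row = [1.0 if i == hit else 0.0 for i in range(len(modes))]
        else:
            row = [1.0 / sum((di / dl) ** exponent for dl in d) for di in d]
        U.append(row)
    return U


def update_modes(data, U, k, alpha=1.5):
    """For each cluster and attribute, pick the category with the largest sum of u**alpha."""
    n_attrs = len(data[0])
    modes = []
    for c in range(k):
        mode = []
        for j in range(n_attrs):
            score = defaultdict(float)
            for x, u in zip(data, U):
                score[x[j]] += u[c] ** alpha
            mode.append(max(score, key=score.get))
        modes.append(tuple(mode))
    return modes


# Toy categorical data set (hypothetical): four objects, two attributes.
data = [("a", "x"), ("a", "y"), ("b", "z"), ("b", "z")]
modes = [("a", "x"), ("b", "z")]

U = memberships(data, modes)        # U[1] is approximately [0.8, 0.2]
new_modes = update_modes(data, U, k=2)
```

Here the second object belongs mostly, but not entirely, to the first cluster, which is the soft assignment that distinguishes fuzzy k-modes from hard k-modes. A weighted variant would multiply each attribute's mismatch by a learned weight instead of counting all attributes equally.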
