New fuzzy C-means clustering method based on feature-weight and cluster-weight learning

Abstract Among fuzzy clustering methods, fuzzy c-means (FCM) is the most widely recognized algorithm. FCM assumes that all features are of equal importance. In real applications, however, the importance of the features differs, and some features are more important than others. These important features should have more influence than the others on the formation of optimal clusters, which the basic FCM algorithm does not support. FCM also suffers from a second problem: it is very sensitive to initialization, and a bad initialization leads to a poor local optimum. Several improved versions of FCM have been proposed in the literature, each of which mitigates either the first problem or the second. Motivated by these weaknesses of FCM, this paper aims to solve both problems at the same time. To this end, an automatic local feature weighting scheme is proposed to properly weight the features of each cluster, and a cluster weighting process is performed to mitigate the initialization sensitivity of FCM. Feature weighting and cluster weighting are performed simultaneously and automatically during the clustering process, resulting in high-quality clusters regardless of the initial centers. Extensive experiments conducted on a synthetic dataset and 16 real-world datasets indicate that the proposed algorithm outperforms state-of-the-art algorithms. A convergence proof for the proposed algorithm is also provided.
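
For orientation, the baseline that the abstract criticizes can be sketched as follows. This is a minimal illustration of standard FCM, not the proposed algorithm: the function name `fcm`, its parameters, and the optional `feature_weights` vector are assumptions made for this example. The weights here are fixed, global, and supplied by the caller, whereas the paper learns per-cluster feature weights and cluster weights automatically, and that update scheme is not reproduced here.

```python
import numpy as np

def fcm(X, c, m=2.0, feature_weights=None, n_iter=100, tol=1e-6, seed=0):
    """Basic fuzzy c-means (illustrative sketch, not the paper's method).

    feature_weights is a hypothetical, fixed per-feature weight vector;
    with the default (all ones) this reduces to standard FCM.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.ones(d) if feature_weights is None else np.asarray(feature_weights, dtype=float)
    # Random initialization: the result depends on this choice, which is
    # the initialization sensitivity mentioned in the abstract.
    centers = X[rng.choice(n, size=c, replace=False)].astype(float)
    for _ in range(n_iter):
        # Weighted squared Euclidean distances, shape (n, c).
        diff = X[:, None, :] - centers[None, :, :]
        d2 = np.maximum((w * diff ** 2).sum(axis=2), 1e-12)
        # Membership update: u_ij proportional to d2_ij^(-1/(m-1)); rows sum to 1.
        inv = d2 ** (-1.0 / (m - 1.0))
        U = inv / inv.sum(axis=1, keepdims=True)
        # Center update: mean of the data weighted by u_ij^m.
        Um = U ** m
        new_centers = (Um.T @ X) / Um.sum(axis=0)[:, None]
        shift = np.abs(new_centers - centers).max()
        centers = new_centers
        if shift < tol:
            break
    return U, centers
```

Calling `fcm(X, c=3)` on an `(n, d)` array returns an `(n, 3)` membership matrix and a `(3, d)` array of centers. Running it with different `seed` values on the same data is a simple way to observe how the final partition can vary with the initial centers.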
