Fuzzy clustering: More than just fuzzification

The idea of extending the classical k-means clustering technique to an algorithm that uses membership degrees instead of crisp assignments of data objects to clusters led to a large variety of new fuzzy clustering algorithms. However, most of these algorithms are concerned with cluster shapes or outliers and could have been defined just as well with crisp assignments. In this paper, we demonstrate that the use of membership degrees, although not necessary from a theoretical point of view, is essential for these algorithms to function in practice. With crisp assignments, these algorithms would get stuck most of the time in a local minimum of their underlying objective function, leading to undesired clustering results. Other contributions have shown that membership degrees can avoid this problem of local minima, but they also introduce new problems, especially for clusters with varying density and for high-dimensional data, at least if fuzzy clustering is carried out with the simple standard fuzzifier.
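To make the contrast between membership degrees and crisp assignments concrete, the following minimal sketch computes both for a single data object. It assumes the widely used fuzzy c-means membership formula with a fuzzifier m > 1, which the abstract does not spell out; the function names and the example distances are hypothetical and chosen only for illustration.

```python
def fuzzy_memberships(distances, m=2.0):
    """Membership degrees of one data object to each cluster.

    Assumes the standard fuzzy c-means update:
        u_i = d_i^(-2/(m-1)) / sum_k d_k^(-2/(m-1))
    distances: distances from the object to each cluster prototype.
    m: fuzzifier, must be greater than 1.
    """
    if m <= 1.0:
        raise ValueError("fuzzifier m must be greater than 1")
    # An object lying exactly on one or more prototypes belongs only to those.
    zeros = [i for i, d in enumerate(distances) if d == 0.0]
    if zeros:
        share = 1.0 / len(zeros)
        return [share if i in zeros else 0.0 for i in range(len(distances))]
    exponent = 2.0 / (m - 1.0)
    weights = [d ** -exponent for d in distances]
    total = sum(weights)
    return [w / total for w in weights]


def crisp_assignment(distances):
    """Crisp (k-means style) assignment: all weight on the nearest prototype."""
    nearest = min(range(len(distances)), key=distances.__getitem__)
    return [1.0 if i == nearest else 0.0 for i in range(len(distances))]


if __name__ == "__main__":
    d = [1.0, 2.0]  # hypothetical distances to two prototypes
    print(fuzzy_memberships(d))   # graded degrees that sum to 1
    print(crisp_assignment(d))    # all-or-nothing assignment
```

Under this formula, an object whose distances to all prototypes are nearly equal receives memberships close to 1/c for c clusters. This may be one way to picture the difficulty with high-dimensional data mentioned above, where distances tend to become similar.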
