Fuzzy Cluster Analysis of Larger Data Sets

Applying fuzzy cluster analysis to larger data sets can cause runtime and memory problems. While deterministic (hard) clustering assigns each data object to exactly one cluster, fuzzy clustering distributes the membership of a data object over several clusters. In standard fuzzy clustering, membership degrees (almost) never become zero, so every data object is assigned to every cluster, even if only with a very small membership degree. This not only demands more computation and memory, it also has the undesired effect that every data object always influences every cluster, no matter how far away from that cluster it lies. New approaches that modify the idea of the fuzzifier have been developed to avoid nonzero membership degrees for all data and clusters. In this paper, these ideas are combined with concepts for speeding up fuzzy clustering through a suitable data organization, so that fuzzy clustering can be applied more efficiently to larger data sets.
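The claim that membership degrees (almost) never become zero can be illustrated with a small sketch. It assumes the textbook fuzzy c-means membership formula, u_ij = 1 / sum_k (d_ij / d_ik)^(2/(m-1)), with fuzzifier m > 1 and Euclidean distances; the abstract itself does not state this formula, and the function name `fcm_memberships` and the sample coordinates are purely illustrative. The modified fuzzifier approaches mentioned above are not reproduced here.

```python
def fcm_memberships(point, centers, m=2.0):
    """Membership degrees of one point to each center under the
    textbook fuzzy c-means update (assumed here, not taken from the paper)."""
    if m <= 1.0:
        raise ValueError("the fuzzifier m must be greater than 1")

    # Squared Euclidean distance from the point to every cluster center.
    d2 = [sum((p - c) ** 2 for p, c in zip(point, center)) for center in centers]

    # Special case: the point coincides with one or more centers.
    # Membership is then shared equally among those centers only.
    if any(d == 0.0 for d in d2):
        hits = [1.0 if d == 0.0 else 0.0 for d in d2]
        total = sum(hits)
        return [h / total for h in hits]

    # u_j is proportional to (1 / d_j^2)^(1 / (m - 1)), normalised to sum to 1.
    exponent = 1.0 / (m - 1.0)
    inv = [(1.0 / d) ** exponent for d in d2]
    total = sum(inv)
    return [v / total for v in inv]


# A point right next to the first center and 99 units from the second one.
centers = [(0.0, 0.0), (100.0, 0.0)]
u = fcm_memberships((1.0, 0.0), centers, m=2.0)
print(u)  # the second entry is tiny but still strictly positive
```

With these sample values the membership to the distant cluster is 1/9802, small but not zero, so the point still has to be stored for and contributes to that cluster's update. This is the effect that drives both the memory cost and the unwanted influence of far-away data described above.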
