Non-Extensive Thermostatistics and Extreme Physical Information for Fuzzy Clustering

Clustering is a widely used knowledge discovery technique that reveals structures in data that can be extremely useful to the analyst. Partitional clustering subdivides the data set into subsets, or clusters, that are pairwise disjoint, all nonempty, and whose union reproduces the original data set. Fuzzy algorithms have been widely studied and applied in this area. In this paper, we focus on fuzzy objective function models, whose aim is to assign the data to clusters so that a given objective function is optimized. These models fall into two categories, probabilistic and possibilistic approaches, each with its own formal and interpretational advantages, which are determined by the specific problem under consideration. We propose a new approach to fuzzy clustering and show how it can be used to obtain a systematic method for deriving objective functions. This approach is based on a unifying principle of physics, that of extreme physical information (EPI) defined by Frieden [11], and is inspired by the work of Frieden and A. Plastino, A.R. Plastino and Miller [33], who extend the principle of extremal information to the framework of non-extensive thermostatistics. The information in question is the trace of the Fisher information matrix for the estimation procedure; this information is shown to be a physical measure of disorder. Finally, we show how, with the help of EPI, one can propose a unification and extension of the probabilistic and possibilistic approaches.
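To make the distinction between the probabilistic and possibilistic categories concrete, the following minimal sketch contrasts the standard fuzzy c-means membership update with the standard possibilistic c-means typicality update. It is an illustration under stated assumptions, not the EPI-derived objective functions proposed in the paper: the function names, the fuzzifier `m`, the scale parameters `eta`, and the guard `eps` are all hypothetical choices made for this example.

```python
# Illustrative sketch only: textbook fuzzy c-means (probabilistic) and
# possibilistic c-means membership updates. This is NOT the EPI-based
# method of the paper; all names and parameter values are assumptions.

def squared_distances(points, centers):
    """Return d2[i][k], the squared Euclidean distance from center i to point k."""
    return [[sum((p - c) ** 2 for p, c in zip(point, center)) for point in points]
            for center in centers]

def probabilistic_memberships(d2, m=2.0, eps=1e-12):
    """FCM-style memberships: for each point, the values sum to 1 over clusters."""
    exponent = 1.0 / (m - 1.0)
    n_clusters, n_points = len(d2), len(d2[0])
    u = [[0.0] * n_points for _ in range(n_clusters)]
    for k in range(n_points):
        # eps guards against division by zero when a point sits on a center.
        inv = [(1.0 / max(d2[i][k], eps)) ** exponent for i in range(n_clusters)]
        total = sum(inv)
        for i in range(n_clusters):
            u[i][k] = inv[i] / total
    return u

def possibilistic_memberships(d2, eta, m=2.0):
    """PCM-style typicalities: each value depends only on its own cluster."""
    exponent = 1.0 / (m - 1.0)
    return [[1.0 / (1.0 + (d / eta[i]) ** exponent) for d in row]
            for i, row in enumerate(d2)]

if __name__ == "__main__":
    # Two cluster centers and three points; the last point is a distant outlier.
    points = [(0.0, 0.0), (10.0, 0.0), (5.0, 100.0)]
    centers = [(0.0, 0.0), (10.0, 0.0)]
    d2 = squared_distances(points, centers)
    print(probabilistic_memberships(d2))
    print(possibilistic_memberships(d2, eta=[1.0, 1.0]))
```

In this toy data set the outlier is equidistant from both centers, so the probabilistic constraint forces memberships of 0.5 in each cluster, whereas the possibilistic typicalities of the same point are close to zero for both clusters. This difference in interpretation is the kind of trade-off between the two categories that the abstract refers to.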

[1]  James C. Bezdek, et al. A mixed c-means clustering model, 1997, Proceedings of 6th International Fuzzy Systems Conference.

[2]  Rajesh N. Davé, et al. Robust clustering methods: a unified view, 1997, IEEE Trans. Fuzzy Syst.

[3]  Stephen L. Chiu, et al. Selecting Input Variables for Fuzzy Models, 1996, J. Intell. Fuzzy Syst.

[4]  James C. Bezdek, et al. Optimization of clustering criteria by reformulation, 1995, IEEE Trans. Fuzzy Syst.

[5]  Sankar K. Pal, et al. Fuzzy models for pattern recognition, 1992.

[6]  A. Schneider. Weighted possibilistic c-means clustering algorithms, 2000, Ninth IEEE International Conference on Fuzzy Systems. FUZZ-IEEE 2000 (Cat. No.00CH37063).

[7]  Geoffrey C. Fox, et al. A deterministic annealing approach to clustering, 1990, Pattern Recognit. Lett.

[8]  Michel Ménard, et al. Possibilistic and probabilistic fuzzy clustering: unification within the framework of the non-extensive thermostatistics, 2003, Pattern Recognit.

[9]  Sadaaki Miyamoto, et al. An overview and new methods in fuzzy clustering, 1998, 1998 Second International Conference. Knowledge-Based Intelligent Electronic Systems. Proceedings KES'98 (Cat. No.98EX111).

[10]  Sadaaki Miyamoto, et al. Fuzzy Clustering by Quadratic Regularization, 1998.

[11]  Sadaaki Miyamoto, et al. Fuzzy c-means as a regularization and maximum entropy approach, 1997.

[12]  C. Tsallis. Entropic nonextensivity: a possible measure of complexity, 2000, cond-mat/0010150.

[13]  R. Yager, et al. Approximate Clustering Via the Mountain Method, 1994, IEEE Trans. Syst. Man Cybern. Syst.

[14]  Enrique H. Ruspini, et al. A New Approach to Clustering, 1969, Inf. Control.

[15]  Miin-Shen Yang. A survey of fuzzy clustering, 1993.

[16]  R.J. Hathaway, et al. Switching regression models and fuzzy clustering, 1993, IEEE Trans. Fuzzy Syst.

[17]  Stephen L. Chiu, et al. Fuzzy Model Identification Based on Cluster Estimation, 1994, J. Intell. Fuzzy Syst.

[18]  Pierre Loonis, et al. The fuzzy c+2-means: solving the ambiguity rejection in clustering, 2000, Pattern Recognit.

[19]  Kenneth G. Manton, et al. Fuzzy Cluster Analysis, 2005.

[20]  Abraham Kandel, et al. Fuzzy Partition and Fuzzy Rule Base, 1998, Inf. Sci.

[21]  A. Kandel. Fuzzy Mathematical Techniques With Applications, 1986.

[22]  Michel Ménard, et al. Extreme physical information and objective function in fuzzy clustering, 2002, Fuzzy Sets Syst.

[23]  Hidetomo Ichihashi, et al. Gaussian Mixture PDF Approximation and Fuzzy c-Means Clustering with Entropy Regularization, 2000.

[24]  James C. Bezdek, et al. Pattern Recognition with Fuzzy Objective Function Algorithms, 1981, Advanced Applications in Pattern Recognition.

[25]  James M. Keller, et al. A possibilistic approach to clustering, 1993, IEEE Trans. Fuzzy Syst.

[26]  James C. Bezdek, et al. Validity-guided (re)clustering with applications to image segmentation, 1996, IEEE Trans. Fuzzy Syst.

[27]  James M. Keller, et al. The possibilistic C-means algorithm: insights and recommendations, 1996, IEEE Trans. Fuzzy Syst.

[28]  C. Tsallis. Possible generalization of Boltzmann-Gibbs statistics, 1988.

[29]  Abraham Kandel, et al. Introduction to Pattern Recognition: Statistical, Structural, Neural and Fuzzy Logic Approaches, 1999.

[30]  Michel Ménard, et al. Fuzzy clustering and switching regression models using ambiguity and distance rejects, 2001, Fuzzy Sets Syst.

[31]  A. Plastino, et al. Tsallis nonextensive thermostatistics and Fisher's information measure, 1997.

[32]  Xinbo Gao, et al. Advances in theory and applications of fuzzy clustering, 2000.

[33]  F. Pennini, et al. The Frieden–Soffer extreme physical information principle in a non-extensive setting, 1999.

[34]  Rui-Ping Li, et al. A maximum-entropy approach to fuzzy clustering, 1995, Proceedings of 1995 IEEE International Conference on Fuzzy Systems.

[35]  B. Frieden, et al. Physics from Fisher Information: A Unification, 1998.

[36]  B. R. Frieden. Fisher information as a measure of time, 1996.

[37]  Assaf Gottlieb, et al. Algorithm for data clustering in pattern recognition problems based on quantum mechanics, 2001, Physical Review Letters.

[38]  Vladik Kreinovich, et al. Optimal Choices of Potential Functions in Fuzzy Clustering, 1998.