Fuzzy C-Mean Algorithm Based on Mahalanobis Distances and Better Initial Values

Most well-known fuzzy partition clustering algorithms are based on the Euclidean distance function, which can only detect clusters with spherical structure. The Gustafson-Kessel (GK) and Gath-Geva (GG) clustering algorithms were developed to detect non-spherical clusters, but both are based on a semi-supervised Mahalanobis distance: they fail to consider the relationships between cluster centers in the objective function and therefore need additional prior information. One important issue is that when the size of a training cluster is smaller than its dimensionality, the covariance matrix becomes singular and its inverse cannot be computed. Another important issue is how to select better initial values to improve clustering accuracy. In this paper, focusing on these two problems, we propose an improved algorithm, “Fuzzy C-Mean based on Unsupervised Mahalanobis distance without any prior information (FCM-M)”. For selecting the initial values, we present a theorem showing that neither FCM nor FCM-M can make use of an initialization in which all memberships have the same value. Experiments on a real data set show that FCM-M performs better than the traditional FCM algorithm, and that the ratio method we propose is the best of the six methods compared for selecting the initial values.
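The initial-value theorem mentioned above can be illustrated with a small sketch. It uses the standard FCM update equations with Euclidean distance, not the FCM-M algorithm proposed in the paper, so it only checks the FCM half of the claim. The function name `fcm_step`, the toy data, and the fuzzifier `m = 2` are assumptions made for illustration. With every membership set to 1/c, all cluster centers collapse onto the data mean, every point is then equally far from every center, and the memberships stay at 1/c.

```python
def fcm_step(points, memberships, m=2.0):
    """One standard FCM iteration with Euclidean distance (illustrative sketch).

    points:      list of n points, each a list of coordinates
    memberships: c rows of n values; memberships[i][j] is u_ij
    Returns the updated centers and the updated membership matrix.
    """
    n, c, dim = len(points), len(memberships), len(points[0])

    # Center update: v_i = sum_j u_ij^m x_j / sum_j u_ij^m
    centers = []
    for i in range(c):
        weights = [memberships[i][j] ** m for j in range(n)]
        total = sum(weights)
        centers.append(
            [sum(w * p[k] for w, p in zip(weights, points)) / total
             for k in range(dim)]
        )

    # Membership update: u_ij = 1 / sum_k (d_ij^2 / d_kj^2)^(1 / (m - 1))
    new_u = [[0.0] * n for _ in range(c)]
    for j, p in enumerate(points):
        d2 = [sum((a - b) ** 2 for a, b in zip(p, v)) for v in centers]
        zero = [i for i, d in enumerate(d2) if d == 0.0]
        if zero:
            # Point coincides with one or more centers: share membership among them.
            for i in zero:
                new_u[i][j] = 1.0 / len(zero)
            continue
        for i in range(c):
            new_u[i][j] = 1.0 / sum(
                (d2[i] / d2[k]) ** (1.0 / (m - 1.0)) for k in range(c)
            )
    return centers, new_u


# Toy data (an assumption): two visibly separated groups in the plane.
points = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]

# Uniform initialization: every membership equals 1/c with c = 2.
uniform = [[0.5] * 4, [0.5] * 4]
centers_uniform, u_uniform = fcm_step(points, uniform)

# A non-uniform initialization, for contrast.
skewed = [[0.9, 0.9, 0.1, 0.1], [0.1, 0.1, 0.9, 0.9]]
centers_skewed, u_skewed = fcm_step(points, skewed)

print("uniform init centers:", centers_uniform)
print("uniform init memberships:", u_uniform)
print("skewed init centers:", centers_skewed)
```

Under the uniform start, the two centers are identical and the iteration never leaves that state, whereas the skewed start moves the centers toward the two groups. This is why an initialization scheme has to break the symmetry among memberships.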
