Fuzzy non-metric model for data with tolerance and its application to incomplete data clustering

Clustering is an unsupervised classification technique. Clustering methods fall into two types: hierarchical and non-hierarchical. The fuzzy non-metric model (FNM) is a representative non-hierarchical method. FNM is useful because the membership degree of each datum to each cluster is calculated directly from the dissimilarities between data, without using cluster centers. However, FNM cannot handle data with uncertainty (uncertain data), e.g., incomplete data or data containing errors. To handle such data, the concept of a tolerance vector has been proposed. Clustering methods based on this concept, e.g., fuzzy c-means for data with tolerance (FCM-T), can handle uncertain data within the framework of optimization. In this paper, we first propose a new clustering algorithm that applies the concept of tolerance to FNM, called the fuzzy non-metric model for data with tolerance (FNM-T). Second, we show that the proposed algorithm can handle incomplete data sets. Third, we verify the effectiveness of the proposed algorithm by comparing it with conventional methods on incomplete data sets through numerical examples.
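To make concrete the statement that FNM computes memberships directly from dissimilarities without cluster centers, the following is a minimal sketch of a Roubens-style FNM update. It is not the FNM-T algorithm proposed in the paper. It assumes the commonly stated objective J = sum_i sum_k sum_t (u_ki)^m (u_ti)^m r_kt with each datum's memberships summing to 1, and it updates one datum at a time with the other memberships held fixed. The function name `fnm_memberships`, its parameters, and the toy data are hypothetical and chosen only for illustration.

```python
import random


def fnm_memberships(r, n_clusters, m=2.0, n_iter=100, seed=0):
    """Sketch of a fuzzy non-metric model (assumed Roubens-style objective).

    r is an n x n symmetric dissimilarity matrix with a zero diagonal.
    Returns an n x n_clusters list of membership degrees; no cluster
    centers are computed at any point.
    """
    rng = random.Random(seed)
    n = len(r)
    eps = 1e-12  # guards against division by zero

    # Random initial memberships, each row normalized to sum to 1.
    u = []
    for _ in range(n):
        row = [rng.random() + 1e-3 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])

    for _ in range(n_iter):
        for k in range(n):
            # Membership-weighted dissimilarity of datum k to each cluster,
            # built only from r and the other data's current memberships.
            d = [
                sum((u[t][i] ** m) * r[k][t] for t in range(n) if t != k)
                for i in range(n_clusters)
            ]
            # Closed-form minimizer for row k under the sum-to-one constraint.
            inv = [(1.0 / max(d_i, eps)) ** (1.0 / (m - 1.0)) for d_i in d]
            total = sum(inv)
            u[k] = [v / total for v in inv]  # in-place (sequential) update
    return u


# Toy one-dimensional data; squared differences serve as dissimilarities.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
r = [[(p - q) ** 2 for q in points] for p in points]
u = fnm_memberships(r, n_clusters=2)
for p, row in zip(points, u):
    print(p, [round(v, 3) for v in row])
```

Updating one datum at a time, rather than all at once, is a design choice of this sketch: with the other rows fixed, each row update exactly minimizes the assumed objective, so the objective does not increase from one step to the next.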
