Fuzzy clustering of incomplete nominal and numerical data

This paper defines a new distance for incomplete nominal data, based on an improved Levenshtein distance combined with a tolerance relation, and a new similarity strategy for incomplete numerical data. These two dissimilarity measures are then combined into a new distance that measures the dissimilarity of objects with both nominal and numerical attributes. Furthermore, a new hierarchical clustering model is presented for classifying incomplete nominal and numerical data; the model does not require the number of cluster centers to be specified in advance. Experimental results show that our method clusters incomplete nominal and numerical data with polynomial time complexity and classifies objects better than Hirano's method on the balloon data set.
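To make the idea of a tolerance-based distance concrete, the following is a minimal sketch and not the paper's actual definition. It assumes that a missing nominal value (written here with the hypothetical marker `*`) is tolerant of, that is, matches, any other value at zero substitution cost, that a missing numerical value (`None`) contributes zero dissimilarity, and that the two parts are averaged with equal weight. The function names, the normalisation, and the weighting are all illustrative assumptions.

```python
MISSING = "*"  # hypothetical marker for a missing nominal value


def tolerant_levenshtein(a, b, missing=MISSING):
    """Levenshtein distance in which a missing symbol matches any symbol.

    Assumption: the tolerance relation is modelled as a zero-cost substitution
    whenever either symbol is missing.
    """
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            tolerant = x == y or x == missing or y == missing
            curr.append(min(
                prev[j] + 1,                          # deletion
                curr[j - 1] + 1,                      # insertion
                prev[j - 1] + (0 if tolerant else 1)  # substitution
            ))
        prev = curr
    return prev[-1]


def numeric_dissimilarity(x, y, value_range):
    """Range-normalised absolute difference; a missing value (None) gives 0."""
    if x is None or y is None or not value_range:
        return 0.0
    return abs(x - y) / value_range


def mixed_distance(nom_a, nom_b, num_a, num_b, ranges):
    """Equal-weight average of the nominal and numerical parts (an assumption)."""
    nominal = tolerant_levenshtein(nom_a, nom_b) / max(len(nom_a), len(nom_b), 1)
    numeric = sum(
        numeric_dissimilarity(x, y, r) for x, y, r in zip(num_a, num_b, ranges)
    ) / max(len(ranges), 1)
    return 0.5 * (nominal + numeric)
```

Under these assumptions, two objects that differ only in positions where one of them is missing a value are at distance zero, so they would be merged early by any agglomerative procedure driven by this distance.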