A k-populations algorithm for clustering categorical data
In this paper, the conventional k-modes-type algorithms for clustering categorical data are extended by representing each cluster with a population instead of the hard-type centroid used in the conventional algorithms; the k clusters are thus described by k populations. A population-based centroid representation preserves the uncertainty inherent in a data set for as long as possible before actual decisions are made. In various experiments, the k-populations algorithm gave markedly better clustering results than the conventional algorithms.
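To make the contrast between a hard centroid and a population-based centroid concrete, the following is a minimal sketch, not the paper's method. It assumes that a population can be modelled as the per-attribute relative frequency of each category within a cluster, and that a record's dissimilarity to a population is the sum over attributes of one minus the frequency of the record's category. The abstract does not specify either definition, and the function names `mode_centroid`, `population_centroid`, and `dissimilarity` are hypothetical.

```python
from collections import Counter

def mode_centroid(records):
    """Hard centroid (k-modes style): most frequent category per attribute."""
    return [Counter(column).most_common(1)[0][0] for column in zip(*records)]

def population_centroid(records):
    """Soft centroid (assumed form): relative frequency of every category
    per attribute, so minority categories are kept rather than discarded."""
    n = len(records)
    return [
        {category: count / n for category, count in Counter(column).items()}
        for column in zip(*records)
    ]

def dissimilarity(record, population):
    """Assumed measure: sum over attributes of 1 - frequency of the
    record's category in the population (unseen categories count as 0)."""
    return sum(
        1.0 - distribution.get(value, 0.0)
        for value, distribution in zip(record, population)
    )

# Toy cluster with two categorical attributes (colour, size).
cluster = [
    ("red", "small"),
    ("red", "large"),
    ("blue", "small"),
    ("red", "small"),
]

modes = mode_centroid(cluster)
population = population_centroid(cluster)
```

In this toy cluster the hard centroid keeps only `red` and `small`, while the population also records that a quarter of the members are `blue` and a quarter are `large`. Under the assumed measure, a record such as `("blue", "large")` is therefore closer to the population than to a centroid that has forgotten those categories.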