The C-Means algorithm has been the subject of many extensions since its first publications. Extensions so far have mainly addressed the following aspects: the selection of initial seeds (centers), the determination of the optimal number of clusters, and the use of different functionals to generate the clusters. This paper proposes an extension to the C-Means algorithm that handles objects (data) described by both quantitative and qualitative features and that also handles missing data. Such descriptions are very frequent in soft sciences such as Medicine, Geology, Sociology, and Marketing, so the application scope of the proposed algorithm is very wide. The proposed algorithm uses similarity functions that may be defined in terms of partial similarity functions, which allows objects to be compared by analyzing subdescriptions of them. Results on standard public databases [2] are shown. In addition, a comparison with the classical C-Means algorithm [7] is provided.
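As an illustration of how a similarity function can be built from partial similarity functions over mixed and incomplete descriptions, the following is a minimal sketch. It is not the function defined in the paper. The names `numeric_criterion`, `nominal_criterion`, and `similarity`, the encoding of missing values as `None`, the range-based numeric criterion, and the choice to skip features that are missing in either object are all assumptions made for this example.

```python
from typing import Optional, Sequence, Union

# Assumed encoding: a feature value is a float (quantitative),
# a str (qualitative), or None (missing).
Value = Optional[Union[float, str]]


def numeric_criterion(a: float, b: float, value_range: float) -> float:
    """Hypothetical partial similarity for a quantitative feature, in [0, 1]."""
    if value_range == 0:
        return 1.0
    return max(0.0, 1.0 - abs(a - b) / value_range)


def nominal_criterion(a: str, b: str) -> float:
    """Hypothetical partial similarity for a qualitative feature: exact match."""
    return 1.0 if a == b else 0.0


def similarity(x: Sequence[Value], y: Sequence[Value],
               ranges: Sequence[Optional[float]]) -> float:
    """Average the partial similarities over features present in both objects.

    ranges[i] is the value range of feature i if it is quantitative,
    or None if it is qualitative. Skipping missing features is an
    assumption of this sketch, not the paper's stated rule.
    """
    total, used = 0.0, 0
    for a, b, r in zip(x, y, ranges):
        if a is None or b is None:
            continue  # feature missing in at least one object
        if r is not None:
            total += numeric_criterion(a, b, r)
        else:
            total += nominal_criterion(a, b)
        used += 1
    return total / used if used else 0.0
```

For example, `similarity((1.0, "red", None), (3.0, "red", "yes"), (4.0, None, None))` compares only the first two features, because the third is missing in the first object.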
[1] Enrique H. Ruspini, et al. A New Approach to Clustering. Inf. Control, 1969.
[2] H. Ralambondrainy, et al. A conceptual version of the K-means algorithm. Pattern Recognit. Lett., 1995.
[3] Robert J. Schalkoff, et al. Pattern recognition - statistical, structural and neural approaches. 1991.
[4] Richard O. Duda, et al. Pattern classification and scene analysis. A Wiley-Interscience publication, 1974.
[5] G. H. Ball, et al. A clustering technique for summarizing multivariate data. Behavioral Science, 1967.