Clustering algorithm based on Condensed Set Dissimilarity for high-dimensional sparse data with categorical attributes