Rough Set Theory: Approach for Similarity Measure in Cluster Analysis

Clustering is an important data mining application. One problem with traditional partitioning clustering methods is that they partition the data into a fixed, hard-bound number of clusters. The rough set based indiscernibility relation, combined with an indiscernibility graph, leads to knowledge discovery in an elegant way. The indiscernibility relation has strong appeal for clustering because it creates natural clusters in the data: it measures the similarity among data items, and clustering is performed on the basis of that similarity. In the proposed approach, the strict notion of indiscernibility is relaxed, and classes are formed on the basis that objects are similar rather than identical. The indiscernibility relation creates indiscernible classes, and representing these classes with an indiscernibility graph aids in better representation of the clusters.
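
To make the idea concrete, the following is a minimal Python sketch of a relaxed indiscernibility relation and its graph. It is an illustration under stated assumptions, not the method of the paper. It assumes similarity is the fraction of attributes on which two objects agree, that a hypothetical `threshold` parameter controls the relaxation (1.0 recovers strict indiscernibility), and that clusters are read off as connected components of the graph. The function names and the toy data are invented for this example.

```python
from itertools import combinations


def similarity(a, b):
    """Fraction of attributes on which two objects agree (assumed measure)."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def indiscernibility_graph(objects, threshold=1.0):
    """Adjacency sets: objects i and j are linked when similarity >= threshold.

    threshold=1.0 links only identical objects (strict indiscernibility);
    lower values relax the relation to "similar rather than identical".
    """
    graph = {i: set() for i in range(len(objects))}
    for i, j in combinations(range(len(objects)), 2):
        if similarity(objects[i], objects[j]) >= threshold:
            graph[i].add(j)
            graph[j].add(i)
    return graph


def clusters(graph):
    """Connected components of the graph, each as a sorted list of indices."""
    seen, result = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], []
        seen.add(start)
        while stack:
            node = stack.pop()
            component.append(node)
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    stack.append(neighbour)
        result.append(sorted(component))
    return result


# Toy data: each object is a tuple of categorical attribute values.
data = [
    ("red", "small", "round"),
    ("red", "small", "round"),
    ("red", "small", "square"),
    ("blue", "large", "square"),
    ("blue", "large", "oval"),
]

strict = clusters(indiscernibility_graph(data, threshold=1.0))
relaxed = clusters(indiscernibility_graph(data, threshold=2 / 3))
print(strict)   # [[0, 1], [2], [3], [4]]
print(relaxed)  # [[0, 1, 2], [3, 4]]
```

With the strict relation only the two identical objects share a class. Relaxing the threshold to two matching attributes out of three merges the objects into two groups, and the number of clusters is not fixed in advance.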