Paper Information - An Attribute Based Storage Method for Speeding up CLIQUE Algorithm for Subspace Clustering

The subspace clustering algorithm CLIQUE finds all subspace clusters, including overlapping clusters, that exist in high-dimensional datasets. CLIQUE consists of three main steps: (1) identification of subspaces that contain clusters, (2) identification of clusters, and (3) generation of a minimal description for the clusters obtained in step two. In this paper, we present a method for speeding up the first step of the CLIQUE algorithm. The proposed method is based on accessing the data by columns instead of by rows. It is very efficient when the high-dimensional dataset, given in the form of a table, contains many missing values. We also propose a depth-first method for finding the maximal dense units, to further improve the performance of the first step.
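The column-based idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `build_columns` and `dense_units`, the equal-width binning over an assumed common range `[lo, hi]`, and the absolute count threshold `min_count` are all assumptions made for illustration. Each attribute stores, per interval, the set of row ids that fall in it, so missing values take no space and are never scanned. Candidate units are then checked by intersecting row-id sets. The sketch grows units bottom-up rather than with the depth-first search the paper proposes.

```python
def build_columns(rows, n_bins, lo, hi):
    """Column-wise index: columns[attr][bin] -> set of row ids.

    Missing values (None) are simply skipped, so sparse tables stay small.
    Assumes every attribute shares the range [lo, hi] (illustrative only).
    """
    width = (hi - lo) / n_bins
    columns = [dict() for _ in range(len(rows[0]))]
    for rid, row in enumerate(rows):
        for attr, value in enumerate(row):
            if value is None:
                continue
            b = min(int((value - lo) / width), n_bins - 1)
            columns[attr].setdefault(b, set()).add(rid)
    return columns


def dense_units(columns, min_count, max_dim):
    """Return {unit: row ids} for every dense unit up to max_dim dimensions.

    A unit is a tuple of (attr, bin) pairs ordered by attribute.
    """
    # 1-dimensional dense units come straight from the column index.
    level = {((a, b),): ids
             for a, col in enumerate(columns)
             for b, ids in col.items()
             if len(ids) >= min_count}
    result = dict(level)
    for _ in range(2, max_dim + 1):
        keys = sorted(level)
        nxt = {}
        for i, u in enumerate(keys):
            for v in keys[i + 1:]:
                # Join two units that share a prefix and end in different attributes.
                if u[:-1] == v[:-1] and u[-1][0] < v[-1][0]:
                    ids = level[u] & level[v]  # support via set intersection
                    if len(ids) >= min_count:
                        nxt[u + (v[-1],)] = ids
        if not nxt:
            break
        result.update(nxt)
        level = nxt
    return result


# Hypothetical 5-row, 3-attribute table with missing values.
rows = [
    (0.10, 0.10, None),
    (0.15, 0.12, 0.90),
    (0.12, None, 0.95),
    (0.80, 0.15, 0.92),
    (None, 0.90, 0.10),
]
columns = build_columns(rows, n_bins=2, lo=0.0, hi=1.0)
units = dense_units(columns, min_count=2, max_dim=3)
```

Because support is computed only from the row-id sets of non-missing entries, the cost of testing a candidate unit depends on how many rows actually have values in those attributes, not on the full table width.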

Jyoti Pawar | P. R. Rao

[1] Dimitrios Gunopulos et al. Automatic subspace clustering of high dimensional data for data mining applications, 1998, SIGMOD '98.

[2] Mohammed J. Zaki et al. GenMax: An Efficient Algorithm for Mining Maximal Frequent Itemsets, 2005, Data Mining and Knowledge Discovery.