Subspace Clustering for Vector Clusters

In many real-world applications, data is collected in multi-dimensional spaces, with the knowledge hidden in subspaces. Selecting meaningful subspaces without any prior knowledge of such hidden patterns remains an open research issue. Subspace clustering aims to detect clusters in any projection of a high-dimensional data space. However, as we will demonstrate, almost all existing subspace clustering methods fail to find subspace clusters of arbitrary shape, especially non-axis-aligned clusters. In this work, we classify subspace clusters into three types: local dense clusters, axis-aligned clusters and non-axis-aligned clusters. To tackle the fundamental challenge of missing non-axis-aligned clusters, we propose a new subspace clustering algorithm named SCUE (Subspace Clustering based on United Entropy). It computes the entropy of each single dimension and the united entropy of each pair of dimensions to form a united entropy matrix. Cluster types are judged by entropy thresholds that are generated automatically from the matrix. It then searches the discretized united entropy matrix for interesting subspaces and extracts clusters from them. Experimental results demonstrate that SCUE significantly outperforms existing methods in both solution quality and efficiency.
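
To make the matrix construction concrete, the following is a minimal sketch under stated assumptions and not the SCUE implementation itself. It assumes that "united entropy" can be read as the Shannon joint entropy of two dimensions, that each dimension is discretized into equal-width bins, and that the diagonal of the matrix holds the 1-dim entropies. The function names (`bin_index`, `shannon_entropy`, `united_entropy_matrix`) and the `bins` parameter are hypothetical and introduced only for illustration. Threshold generation and the subspace search are not sketched here.

```python
import math
from collections import Counter


def bin_index(value, lo, hi, bins):
    """Map a value to an equal-width bin index in [0, bins - 1]."""
    if hi == lo:
        return 0  # constant dimension: everything falls into one bin
    idx = int((value - lo) / (hi - lo) * bins)
    return min(idx, bins - 1)  # the maximum value goes into the last bin


def shannon_entropy(labels):
    """Shannon entropy in bits of a sequence of hashable labels."""
    n = len(labels)
    counts = Counter(labels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def united_entropy_matrix(points, bins=4):
    """Build a symmetric d x d matrix of entropies.

    Assumption: the diagonal holds the entropy of each single dimension,
    and entry (i, j) holds the joint entropy of dimensions i and j.
    """
    d = len(points[0])
    columns = list(zip(*points))

    # Discretize every dimension independently into equal-width bins.
    binned = []
    for col in columns:
        lo, hi = min(col), max(col)
        binned.append([bin_index(v, lo, hi, bins) for v in col])

    matrix = [[0.0] * d for _ in range(d)]
    for i in range(d):
        matrix[i][i] = shannon_entropy(binned[i])
        for j in range(i + 1, d):
            # Joint entropy: treat each pair of bin indices as one grid cell.
            joint = shannon_entropy(list(zip(binned[i], binned[j])))
            matrix[i][j] = matrix[j][i] = joint
    return matrix


# Dimensions 0 and 1 are perfectly dependent; dimension 2 is independent.
points = [(x, x, z) for x in range(4) for z in range(4)]
for row in united_entropy_matrix(points, bins=4):
    print(row)
```

In this toy data, the joint entropy of dimensions 0 and 1 equals the entropy of either dimension alone, while the joint entropy of dimensions 0 and 2 equals the sum of their individual entropies. A joint entropy well below that sum indicates that the points concentrate in few grid cells of the two-dimensional projection, which is the kind of signal that entropy thresholds could act on.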