Data clustering algorithm based on binary subspace division

Clustering is an important data analysis method in data mining. We analyze existing clustering algorithms and propose a new grid-density clustering algorithm based on binary subspace division. The region quadtree is a spatial data structure based on binary division, and we apply it to 2-dimensional clustering. We present the construction algorithm for the region-density tree (RD-quadtree), the region merging algorithm, and the algorithm for computing the connected components of the RD-quadtree. We then extend the algorithm to high-dimensional data spaces and analyze the space and time complexity of the RD-quadtree-based clustering algorithm. We further prove that the RD-quadtree-based clustering algorithm performs grid division only on the non-empty regions of the high-dimensional data space, which drastically reduces the number of grid cells and yields higher space and time efficiency.
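
To make the general idea concrete, the following Python sketch shows one way a 2-dimensional grid-density clustering can be built on recursive binary division. It is an illustration under stated assumptions, not the RD-quadtree algorithm itself: the function names (build_leaves, cluster), the fixed subdivision depth, the density threshold min_pts, and the 8-neighbor rule for merging adjacent dense cells are all hypothetical choices made for this example. Points are assumed to lie inside the square [x0, x0 + size) x [y0, y0 + size).

```python
from collections import deque


def build_leaves(points, x0, y0, size, depth, ix=0, iy=0, leaves=None):
    """Recursively halve a square cell along each axis, descending only into
    non-empty quadrants. Returns {(ix, iy): [points]} for finest-level cells."""
    if leaves is None:
        leaves = {}
    if not points:
        return leaves  # empty space is never subdivided or stored
    if depth == 0:
        leaves[(ix, iy)] = points
        return leaves
    half = size / 2.0
    quadrants = {}
    for (x, y) in points:
        qx = 1 if x >= x0 + half else 0
        qy = 1 if y >= y0 + half else 0
        quadrants.setdefault((qx, qy), []).append((x, y))
    # Only quadrants that actually received points are visited.
    for (qx, qy), pts in quadrants.items():
        build_leaves(pts, x0 + qx * half, y0 + qy * half, half,
                     depth - 1, 2 * ix + qx, 2 * iy + qy, leaves)
    return leaves


def cluster(points, x0, y0, size, depth, min_pts):
    """Group points by connected components of dense finest-level cells.
    A cell is 'dense' if it holds at least min_pts points (assumed rule)."""
    leaves = build_leaves(list(points), x0, y0, size, depth)
    dense = {cell for cell, pts in leaves.items() if len(pts) >= min_pts}
    seen = set()
    clusters = []
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:  # breadth-first search over adjacent dense cells
            cx, cy = queue.popleft()
            members.extend(leaves[(cx, cy)])
            for dx in (-1, 0, 1):
                for dy in (-1, 0, 1):
                    nb = (cx + dx, cy + dy)
                    if nb in dense and nb not in seen:
                        seen.add(nb)
                        queue.append(nb)
        clusters.append(members)
    return clusters


if __name__ == "__main__":
    pts = [(1.1, 1.1), (1.2, 1.3), (2.1, 1.2), (2.3, 1.4),   # one group
           (6.1, 6.1), (6.2, 6.4), (6.5, 6.5),               # another group
           (4.5, 0.5)]                                       # isolated point
    for group in cluster(pts, 0.0, 0.0, 8.0, depth=3, min_pts=2):
        print(len(group), group)
```

In this sketch only the cells that contain at least one point are ever created, so the number of stored cells is bounded by the number of points rather than by the full grid size. The same recursive halving carries over to d dimensions by splitting each cell into up to 2^d children, although the neighbor enumeration used here would then need a more careful treatment.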