NPClu: An approach for clustering spatially extended objects

Most clustering algorithms deal with data collections that can be represented as sets of points in a multidimensional Euclidean space. However, many application domains, such as spatiotemporal databases and medical applications, produce datasets of non-point objects (i.e., objects that occupy a specific region of the space rather than a single point). Traditional clustering algorithms rely mainly on statistical properties of the data and therefore cannot efficiently partition sets of spatially extended objects. In this paper we propose NPClu, an approach for clustering sets of objects that takes into account their geometric and topological properties. The spatial objects are approximated by their minimum bounding rectangles (MBRs). Our approach then discovers clusters in the set of the MBRs' vertices in three steps: pre-processing, clustering, and refinement. We experimentally evaluate the performance of our approach to show its effectiveness.
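The idea of approximating objects by MBRs and clustering the MBR vertices can be illustrated with a minimal Python sketch. The abstract does not specify what each of the three steps does internally, so everything below is an assumption made for illustration: objects are 2D polygons given as point lists, the clustering step is replaced by a naive distance-threshold grouping (connected components with a hypothetical `eps` parameter), and refinement is replaced by a simple majority vote over each object's vertices. The function names are hypothetical and this is not the NPClu algorithm itself.

```python
from collections import Counter
from math import dist


def mbr(points):
    """Axis-aligned minimum bounding rectangle as (xmin, ymin, xmax, ymax)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (min(xs), min(ys), max(xs), max(ys))


def mbr_vertices(rect):
    """The four corner points of an MBR."""
    xmin, ymin, xmax, ymax = rect
    return [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]


def group_by_distance(points, eps):
    """Stand-in clustering (assumption): label connected components in which
    two points are linked when they lie within distance eps of each other."""
    labels = [-1] * len(points)
    current = 0
    for start in range(len(points)):
        if labels[start] != -1:
            continue
        labels[start] = current
        stack = [start]
        while stack:
            i = stack.pop()
            for j in range(len(points)):
                if labels[j] == -1 and dist(points[i], points[j]) <= eps:
                    labels[j] = current
                    stack.append(j)
        current += 1
    return labels


def cluster_objects(objects, eps):
    """Return one cluster label per object."""
    # Pre-processing (assumed form): replace each object by its MBR vertices.
    vertices, owner = [], []
    for obj_id, pts in enumerate(objects):
        for v in mbr_vertices(mbr(pts)):
            vertices.append(v)
            owner.append(obj_id)

    # Clustering: group the vertices, not the objects themselves.
    labels = group_by_distance(vertices, eps)

    # Refinement (assumed form): each object takes the label held by
    # most of its vertices.
    result = []
    for obj_id in range(len(objects)):
        votes = Counter(l for l, o in zip(labels, owner) if o == obj_id)
        result.append(votes.most_common(1)[0][0])
    return result


objects = [
    [(0, 0), (2, 0), (1, 1)],            # triangle
    [(2.5, 0), (4, 0), (4, 1), (2.5, 1)],  # rectangle next to the triangle
    [(20, 20), (22, 20), (21, 22)],      # distant triangle
]
print(cluster_objects(objects, eps=2.5))
```

In this sketch the two neighbouring objects end up in the same cluster because their MBR vertices are close, even though their centroids are further apart. That is the kind of case where treating extended objects as single points can lose information.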
