Density Based Subspace Clustering over Dynamic Data

Modern data are often both high-dimensional and dynamic. Subspace clustering aims to find clusters together with the dimensions of the high-dimensional feature space in which those clusters exist. Existing subspace clustering methods are mostly static and cannot address the dynamic nature of modern data. In this paper, we propose a dynamic subspace clustering method that extends the density-based projected clustering algorithm PreDeCon to dynamic data. The proposed method efficiently examines only those clusters that might be affected by a population update. Both single and batch updates are considered.
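To make the idea of a local subspace preference and of a locally bounded update more concrete, the following is a minimal sketch and not the paper's implementation. It assumes the usual PreDeCon-style definition, in which a dimension is "preferred" by a point when the variance of its eps-neighborhood along that dimension is at most a threshold, and preferred dimensions receive a large weight. The function names and the parameters eps, delta and kappa are chosen here for illustration only. The last function covers only the first step of an update: the actual incremental method must also follow how changes propagate to further points and clusters.

```python
from typing import List, Sequence

Point = Sequence[float]

def neighborhood(points: List[Point], p: Point, eps: float) -> List[Point]:
    """Return all points within Euclidean distance eps of p."""
    return [q for q in points
            if sum((a - b) ** 2 for a, b in zip(p, q)) <= eps ** 2]

def preference_weights(p: Point, neighbors: List[Point],
                       delta: float, kappa: float) -> List[float]:
    """Assumed PreDeCon-style weights: kappa for low-variance dimensions, else 1."""
    if not neighbors:
        # No neighborhood information: treat every dimension as unpreferred.
        return [1.0] * len(p)
    weights = []
    for i in range(len(p)):
        # Variance of the neighborhood along dimension i, measured around p.
        var_i = sum((q[i] - p[i]) ** 2 for q in neighbors) / len(neighbors)
        weights.append(kappa if var_i <= delta else 1.0)
    return weights

def directly_affected(points: List[Point], new_point: Point,
                      eps: float) -> List[Point]:
    """Points whose eps-neighborhood gains new_point (first step of an update only)."""
    return neighborhood(points, new_point, eps)

# Illustrative data: points spread along dimension 1, tight along dimension 0.
data = [(0.0, 0.0), (0.1, 5.0), (-0.1, -5.0), (0.0, 2.0)]
p = data[0]
w = preference_weights(p, neighborhood(data, p, eps=6.0), delta=0.1, kappa=100.0)
touched = directly_affected([(0.0, 0.0), (10.0, 10.0)], (0.5, 0.0), eps=1.0)
```

In this sketch, only the points returned by directly_affected would need their preference weights recomputed after an insertion; distant points, such as (10.0, 10.0) in the example, are left untouched.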
