A density-based clustering over evolving heterogeneous data stream

Data stream clustering is an important problem in data stream mining. Most existing algorithms use only continuous features for clustering. In this paper, we introduce HDenStream, an algorithm for clustering data streams with heterogeneous features. Because HDenStream is density-based, it can find clusters of arbitrary shape and handle outliers. Theoretical analysis and experimental results show that HDenStream is effective and efficient.
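The abstract does not say how HDenStream measures distance between records that mix continuous and categorical features. As a minimal sketch only, and not the paper's method, the following assumes one common choice: Euclidean distance on the continuous part plus a count of mismatched categorical values. The names `mixed_distance`, `eps_neighbors`, and the radius `eps` are hypothetical and introduced purely for illustration of a density-style neighborhood query over heterogeneous records.

```python
import math

def mixed_distance(a_num, a_cat, b_num, b_cat):
    """Assumed distance: Euclidean on continuous features
    plus the number of mismatched categorical features."""
    numeric = math.sqrt(sum((x - y) ** 2 for x, y in zip(a_num, b_num)))
    categorical = sum(1 for x, y in zip(a_cat, b_cat) if x != y)
    return numeric + categorical

def eps_neighbors(points, query, eps):
    """Return indices of points within distance eps of the query.
    Each point is a (continuous_tuple, categorical_tuple) pair."""
    q_num, q_cat = query
    return [
        i
        for i, (p_num, p_cat) in enumerate(points)
        if mixed_distance(p_num, p_cat, q_num, q_cat) <= eps
    ]

# Hypothetical records: two continuous features and one categorical feature.
points = [
    ((0.0, 0.0), ("a",)),
    ((0.3, 0.4), ("a",)),
    ((0.0, 0.0), ("b",)),
    ((5.0, 5.0), ("a",)),
]
dense_region = eps_neighbors(points, ((0.0, 0.0), ("a",)), eps=0.6)
```

In a density-based method, a region is treated as dense when enough points (or enough weight) fall inside such a neighborhood; points with too few neighbors are candidates for outliers. How the two distance components should be scaled against each other is a design choice this sketch leaves open.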
