Survey on Data Streams Clustering Techniques

Data streams are a popular research topic in the big data era, and data stream clustering has produced many research results. This paper first gives a brief introduction to data stream methodologies, such as sampling and sliding windows. It then presents a survey of data stream clustering techniques.
