Clustering from Data Streams

Clustering is one of the most popular data mining techniques. In this article, we review the relevant methods for designing clustering algorithms under the data stream computational model and discuss research directions in tracking evolving clusters.
