Clustering from Data Streams

Clustering is one of the most popular data mining techniques. In this article, we review the relevant methods for designing clustering algorithms under the data stream computational model, and discuss research directions in tracking evolving clusters.
