Clustering Multiple Data Streams

In recent years, data stream analysis has attracted considerable attention because a growing number of application fields generate huge amounts of temporal data. This paper focuses on the clustering of multiple streams. We propose a new strategy that groups similar streams and, at the same time, computes summaries of the incoming data. It follows a divide-and-conquer approach: a continuously updated graph collects information on the incoming data, and an off-line partitioning algorithm provides the final clustering structure. An application to real data sets corroborates the effectiveness of the proposal.
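The two-phase idea can be illustrated with a minimal sketch. It is not the algorithm of the paper, whose details the abstract does not give. Everything here is an assumption made for illustration: the names `CoClusteringGraph` and `cluster_streams`, the non-overlapping windows, the Euclidean distance with a `radius` threshold as the on-line similarity test, and thresholded connected components as a stand-in for the off-line partitioning step. The computation of data summaries is omitted.

```python
from itertools import combinations
from math import sqrt


def euclidean(a, b):
    """Euclidean distance between two equally long value sequences."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class CoClusteringGraph:
    """Hypothetical graph over streams (illustrative, not from the paper).

    The weight of edge (i, j) counts the windows in which streams i and j
    were judged similar.
    """

    def __init__(self, n_streams):
        self.n = n_streams
        self.windows_seen = 0
        self.weights = {pair: 0 for pair in combinations(range(n_streams), 2)}

    def update(self, window, radius):
        """On-line step: window[i] holds the values of stream i in this window."""
        self.windows_seen += 1
        for i, j in self.weights:
            # Assumed similarity test: distance within a fixed radius.
            if euclidean(window[i], window[j]) <= radius:
                self.weights[(i, j)] += 1

    def partition(self, min_share):
        """Off-line step: keep edges seen in at least min_share of the windows
        and return the connected components (an assumed, simple partitioner)."""
        parent = list(range(self.n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]  # path compression
                x = parent[x]
            return x

        for (i, j), w in self.weights.items():
            if self.windows_seen and w / self.windows_seen >= min_share:
                parent[find(i)] = find(j)

        groups = {}
        for s in range(self.n):
            groups.setdefault(find(s), []).append(s)
        return sorted(groups.values())


def cluster_streams(streams, window_size, radius, min_share):
    """Split the streams into non-overlapping windows, update the graph
    window by window, then partition it once at the end."""
    graph = CoClusteringGraph(len(streams))
    length = min(len(s) for s in streams)
    for start in range(0, length - window_size + 1, window_size):
        window = [s[start:start + window_size] for s in streams]
        graph.update(window, radius)
    return graph.partition(min_share)
```

With four toy streams, two near 0 and two near 5, a call such as `cluster_streams(streams, window_size=2, radius=0.5, min_share=0.5)` groups the streams by level. Because the graph is only updated per window, the raw data never need to be stored. A different partitioner could replace the connected-components step without changing the on-line part.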
