Clustering Data Streams by On-Line Proximity Updating

In this paper, we introduce a new clustering strategy for temporally ordered data streams that discovers groups of homogeneous streams in a single pass over the data. It is a two-step approach: an on-line algorithm computes statistics about the dissimilarities among the data, and an off-line algorithm then computes the final partition of the streams. The effectiveness of the proposal is evaluated through tests on real data.
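The two-step idea can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not say which dissimilarity statistic or which partitioning method is used, so the squared Euclidean distance per window, the running average, and the k-medoids step below are all assumptions, and the names `OnlineProximity` and `k_medoids` are hypothetical.

```python
from itertools import combinations


class OnlineProximity:
    """On-line step (assumed form): running sum of pairwise squared
    Euclidean distances between streams, updated one window at a time."""

    def __init__(self, n_streams):
        self.n = n_streams
        self.windows = 0
        self.total = [[0.0] * n_streams for _ in range(n_streams)]

    def update(self, window):
        # window[i] holds the newest observations of stream i; each
        # window is read once and can then be discarded (single pass).
        for i, j in combinations(range(self.n), 2):
            d = sum((a - b) ** 2 for a, b in zip(window[i], window[j]))
            self.total[i][j] += d
            self.total[j][i] += d
        self.windows += 1

    def dissimilarity(self):
        # Average dissimilarity per window seen so far.
        w = max(self.windows, 1)
        return [[v / w for v in row] for row in self.total]


def assign(D, medoids):
    # Label each stream with the index of its closest medoid.
    return [min(range(len(medoids)), key=lambda c: D[i][medoids[c]])
            for i in range(len(D))]


def k_medoids(D, k, max_iter=100):
    """Off-line step (assumed form): plain k-medoids on the matrix D."""
    medoids = list(range(k))  # deterministic start: the first k streams
    for _ in range(max_iter):
        labels = assign(D, medoids)
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                new_medoids.append(medoids[c])  # keep an empty cluster's medoid
                continue
            # New medoid: the member with the smallest total dissimilarity
            # to the other members of its cluster.
            new_medoids.append(
                min(members, key=lambda m: sum(D[m][i] for i in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign(D, medoids), medoids


# Usage: four toy streams arriving in two windows of two points each.
prox = OnlineProximity(4)
prox.update([[0.0, 0.1], [10.0, 10.1], [0.1, 0.0], [10.1, 10.0]])
prox.update([[0.2, 0.1], [9.9, 10.0], [0.0, 0.2], [10.0, 9.8]])
labels, medoids = k_medoids(prox.dissimilarity(), k=2)
```

Only the dissimilarity matrix is kept between windows, so memory grows with the square of the number of streams rather than with the length of the streams.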
