Paper information - An evolutionary K-means algorithm for clustering time series data


It is well known that the K-means clustering algorithm easily gets stuck at local optima on high-dimensional data. Many initialization techniques have been proposed to address this problem, but with only limited success. We propose an evolutionary K-means algorithm that combines a genetic algorithm with K-means to improve the search ability of K-means. In the crossover operation, we rearrange the clusters based on the distances between cluster centers to avoid generating meaningless offspring. A new genetic operator called swap replaces the traditional mutation operator to avoid producing invalid offspring. Experiments on several publicly available time series data sets demonstrate the effectiveness and efficiency of the proposed algorithm.
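The idea of rearranging clusters before crossover can be illustrated with a minimal sketch. The abstract does not specify the matching procedure, so everything here is an assumption: each chromosome is taken to be a list of K cluster centers, the second parent's centers are reordered by greedy nearest-center matching under Euclidean distance, and a one-point crossover is then applied at whole-center granularity. The function names `euclidean`, `align_centers`, and `crossover` are hypothetical, and the swap operator is not shown because the abstract does not describe it.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def align_centers(reference, other):
    """Reorder `other` so that its i-th center is matched to reference[i].

    Greedy nearest-center matching is an assumption for illustration,
    not the procedure from the paper.
    """
    remaining = list(other)
    aligned = []
    for ref_center in reference:
        nearest = min(remaining, key=lambda c: euclidean(ref_center, c))
        remaining.remove(nearest)
        aligned.append(nearest)
    return aligned


def crossover(parent_a, parent_b, rng=random):
    """One-point crossover over whole cluster centers, after alignment.

    Requires at least two centers per parent.
    """
    aligned_b = align_centers(parent_a, parent_b)
    cut = rng.randint(1, len(parent_a) - 1)
    child_1 = parent_a[:cut] + aligned_b[cut:]
    child_2 = aligned_b[:cut] + parent_a[cut:]
    return child_1, child_2


if __name__ == "__main__":
    parent_a = [[0.0, 0.0], [5.0, 5.0], [10.0, 0.0]]
    parent_b = [[9.5, 0.5], [0.2, -0.1], [5.1, 4.8]]
    print(align_centers(parent_a, parent_b))
    print(crossover(parent_a, parent_b))
```

Without the alignment step, position i in one parent and position i in the other may describe unrelated clusters, so a positional crossover could produce a child with two nearly identical centers and none covering some region of the data. Aligning first makes each position refer to roughly the same cluster in both parents.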

Hui Zhang | Tu-Bao Ho | Mao-Song Lin
