Effective Clustering of Time-Series Data Using FCM

Significant advances in time-series clustering have been made in the field of data mining. Many of these successes stem from recent progress in dimensionality reduction and distance measurement for time-series data. However, conventional approaches have not fully solved the time-series clustering problem, especially when the class labels of the time series are vague. In this paper, a two-level fuzzy clustering strategy is employed to address this problem. In the first level, after dimensionality reduction by a symbolic representation, the time-series data are clustered at a high level using the longest common subsequence as the similarity measure. Then, prototypes are built efficiently from the constructed clusters and passed to the next level to be reused as initial centroids. Finally, a fuzzy clustering approach is used to refine the clusters precisely. We demonstrate the benefits of the proposed system on a real application: credit card transaction clustering.
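The building blocks named above can be illustrated with a minimal Python sketch. It is not the paper's implementation. It assumes a SAX-style symbolic representation with a four-letter alphabet, a normalized longest-common-subsequence similarity, and standard fuzzy c-means (FCM) with fuzzifier m = 2. The abstract does not specify the prototype-building method, so the sketch simply takes the initial centroids as an input. All function names and parameters are hypothetical.

```python
import math
from statistics import mean, pstdev

# Assumption: 4-symbol alphabet with the usual Gaussian equiprobable breakpoints.
BREAKPOINTS = [-0.67, 0.0, 0.67]
ALPHABET = "abcd"

def sax(series, n_segments):
    """Z-normalize, average over equal-width segments (PAA), map to symbols."""
    mu, sigma = mean(series), pstdev(series)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in series]
    seg = len(z) // n_segments  # assumes len(series) is a multiple of n_segments
    paa = [mean(z[i * seg:(i + 1) * seg]) for i in range(n_segments)]
    return "".join(ALPHABET[sum(v > b for b in BREAKPOINTS)] for v in paa)

def lcs_length(a, b):
    """Length of the longest common subsequence, by row-wise dynamic programming."""
    prev = [0] * (len(b) + 1)
    for ch in a:
        cur = [0]
        for j, bj in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if ch == bj else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]

def lcs_similarity(a, b):
    """LCS length scaled to [0, 1] by the longer string (one possible normalization)."""
    return lcs_length(a, b) / max(len(a), len(b))

def memberships(points, centroids, m=2.0):
    """FCM membership of every point in every cluster; each row sums to 1."""
    u = []
    for p in points:
        # Clamp distances away from zero so a point sitting on a centroid is safe.
        d = [max(math.dist(p, c), 1e-12) for c in centroids]
        u.append([1.0 / sum((di / dk) ** (2 / (m - 1)) for dk in d) for di in d])
    return u

def fcm(points, centroids, m=2.0, n_iter=50):
    """Fuzzy c-means on numeric vectors, starting from the given centroids."""
    n, dim = len(points), len(points[0])
    for _ in range(n_iter):
        u = memberships(points, centroids, m)
        # Each centroid is the mean of all points weighted by membership ** m.
        centroids = [
            [sum(u[i][k] ** m * points[i][j] for i in range(n)) /
             sum(u[i][k] ** m for i in range(n))
             for j in range(dim)]
            for k in range(len(centroids))
        ]
    return memberships(points, centroids, m), centroids
```

In a two-level pipeline of this kind, `sax` and `lcs_similarity` would drive the coarse first-level grouping. Prototypes derived from those groups would then be passed as `centroids` to `fcm`, which refines the partition and returns soft memberships.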
