Piecewise two-dimensional normal cloud representation for time-series data mining

Many high-level dimensionality reduction approaches for mining time series have been proposed, e.g., SAX, PWCA, and feature-based methods. Because the performance of time-series data mining degrades rapidly at much lower dimensionality, and because the amount of time series data with uncertainty keeps increasing, there remains a pressing need for new time-series representations that retain good performance in a much lower-dimensional reduced space and address uncertainty efficiently. In this work, we propose a novel time series representation, the Two-dimensional Normal Cloud Representation (2D-NCR), based on cloud model theory. The representation achieves dimensionality reduction by transforming the raw time series into a sequence of two-dimensional normal cloud models. We also present a new similarity measure between the transformed time series. The proposed method reflects the characteristic data distribution of the time series and captures its variation over time. We validate the representation on three data mining tasks: classification, clustering, and query by content. The experimental results demonstrate that 2D-NCR is an effective and competitive representation for time-series data mining.
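To make the idea of a piecewise cloud representation concrete, the following is a minimal sketch and not the paper's algorithm, whose details the abstract does not give. It assumes equal-length segments, uses the standard backward cloud transformation without certainty degrees to estimate the expectation, entropy, and hyper-entropy (Ex, En, He) of each segment, and assumes that the second dimension is the first difference of the series as a stand-in for "variation with time". The function names `backward_cloud` and `piecewise_2d_cloud` are hypothetical.

```python
import math
from statistics import fmean


def backward_cloud(values):
    """Estimate (Ex, En, He) of a one-dimensional normal cloud from samples."""
    values = list(values)
    n = len(values)
    ex = fmean(values)
    # Entropy estimate from the first-order absolute central moment.
    en = math.sqrt(math.pi / 2) * fmean([abs(v - ex) for v in values])
    # Sample variance; the hyper-entropy is clamped at zero when S^2 < En^2.
    var = sum((v - ex) ** 2 for v in values) / (n - 1) if n > 1 else 0.0
    he = math.sqrt(max(var - en ** 2, 0.0))
    return ex, en, he


def piecewise_2d_cloud(series, n_segments):
    """Map a series to n_segments pairs of cloud parameters.

    Assumption (not from the paper): dimension 1 is the raw values of a
    segment, dimension 2 is the first differences inside that segment.
    """
    n = len(series)
    if not 1 <= n_segments <= n:
        raise ValueError("n_segments must be between 1 and len(series)")
    clouds = []
    for i in range(n_segments):
        start = i * n // n_segments
        end = (i + 1) * n // n_segments
        segment = series[start:end]
        # A single-point segment has no differences; use zero as a placeholder.
        diffs = [b - a for a, b in zip(segment, segment[1:])] or [0.0]
        clouds.append((backward_cloud(segment), backward_cloud(diffs)))
    return clouds


if __name__ == "__main__":
    toy_series = [0, 1, 2, 3, 10, 10, 10, 10]
    for value_cloud, change_cloud in piecewise_2d_cloud(toy_series, 2):
        print(value_cloud, change_cloud)
```

Under these assumptions, a series of length n is reduced to `n_segments` pairs of three parameters each. The similarity measure between transformed series is not sketched here, since the abstract does not define it.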
