A New Multivariate Time Series Transformation Technique Using Closed Interesting Subspaces

Subspace clustering detects clusters that exist in subspaces of the feature space. Density-based subspace clustering defines clusters as regions of high density in subspaces of multidimensional datasets. This paper discusses the concept of closed interesting subspaces in the density divergence context for multivariate datasets and proposes an algorithm that uses these subspaces to transform a multivariate time series into a symbol sequence. The transformation allows any symbolic sequential mining algorithm to be applied to extract sequential patterns that capture the interdependencies and co-variations among groups of time series variables. The transformation technique is explained using a sample dataset and evaluated on a real-world weather dataset obtained from Cambridge University. The representation power of closed interesting subspaces is compared with that of maximal interesting subspaces in transforming multivariate time series.
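The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes equal-width binning of each variable, and it uses a plain support threshold as a stand-in for the density divergence criterion. A subspace here is a set of (variable, bin) units, and it is treated as closed when no proper superset has the same support. All function names and parameters are hypothetical.

```python
from itertools import combinations


def discretize(series, n_bins):
    """Map each time point to a frozenset of (variable, bin) units (equal-width bins)."""
    n_vars = len(series[0])
    lows = [min(row[j] for row in series) for j in range(n_vars)]
    highs = [max(row[j] for row in series) for j in range(n_vars)]
    rows = []
    for row in series:
        units = set()
        for j, x in enumerate(row):
            # Fall back to width 1.0 for a constant variable to avoid dividing by zero.
            width = (highs[j] - lows[j]) / n_bins or 1.0
            units.add((j, min(int((x - lows[j]) / width), n_bins - 1)))
        rows.append(frozenset(units))
    return rows


def closed_subspaces(rows, min_support):
    """Return {unit_set: support} for unit sets that meet min_support and are closed.

    Brute-force enumeration: exponential in the number of variables, so this is
    only suitable for small illustrations. min_support is an assumed stand-in
    for the paper's interestingness measure.
    """
    support = {}
    for row in rows:
        items = sorted(row)
        for k in range(1, len(items) + 1):
            for combo in combinations(items, k):
                key = frozenset(combo)
                support[key] = support.get(key, 0) + 1
    frequent = {s: c for s, c in support.items() if c >= min_support}
    # Closed: no proper superset covers exactly the same number of time points.
    return {s: c for s, c in frequent.items()
            if not any(s < t and c == frequent[t] for t in frequent)}


def to_symbols(rows, subspaces):
    """Label each time point with the largest closed subspace that contains it."""
    ordered = sorted(subspaces, key=lambda s: (-len(s), sorted(s)))
    names = {s: f"S{i}" for i, s in enumerate(ordered)}
    symbols = []
    for row in rows:
        match = next((s for s in ordered if s <= row), None)
        symbols.append(names[match] if match is not None else "-")
    return symbols


if __name__ == "__main__":
    # Toy two-variable series (made-up values, for illustration only).
    series = [(1.0, 10.0), (1.1, 10.2), (5.0, 10.1), (5.1, 20.0), (1.0, 10.0)]
    rows = discretize(series, n_bins=2)
    print(to_symbols(rows, closed_subspaces(rows, min_support=2)))
```

The resulting symbol list can be passed to any sequential pattern miner that accepts a sequence of discrete symbols. Preferring the largest matching subspace is one possible tie-breaking choice; others (for example, the highest-support match) would be equally reasonable in a sketch like this.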
