Mitigating the influence of the curse of dimensionality on time series similarity measures

Time series data are generated in many application domains, including GPS tracking, stock markets, and ECG monitoring. Researchers mine time series data to extract useful knowledge and insights. Time series similarity search is a widely used data mining technique that compares time series using similarity measures such as dynamic time warping and Euclidean distance. The high dimensionality of the sequences makes the mining process costly. It is therefore desirable to extract a smaller set of representative points so that mining remains manageable. In this paper, we investigate the application of three dimensionality reduction techniques (random projection, downsampling, and averaging) to time series similarity search. Our study is based on extensive experiments. The results show how the reduction techniques perform under two similarity measures. Simulations show that high similarity matching accuracy can still be achieved after reduction to lower dimensions.
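The three reductions and the two similarity measures named above can be illustrated with a minimal Python sketch. It is not the implementation used in the study: the function names, the Gaussian projection matrix scaled by 1/sqrt(k), the evenly spaced sampling rule, the absolute-difference local cost in dynamic time warping, and the requirement that the target length k not exceed the series length are all assumptions made for illustration.

```python
import math
import random


def downsample(series, k):
    """Keep k approximately evenly spaced points (assumes 1 <= k <= len(series))."""
    n = len(series)
    return [series[(i * n) // k] for i in range(k)]


def average(series, k):
    """Replace each of k contiguous segments by its mean (assumes 1 <= k <= len(series))."""
    n = len(series)
    reduced = []
    for i in range(k):
        start, end = (i * n) // k, ((i + 1) * n) // k
        segment = series[start:end]
        reduced.append(sum(segment) / len(segment))
    return reduced


def random_projection_matrix(n, k, seed=0):
    """Build a k x n Gaussian matrix scaled by 1/sqrt(k) (one common choice, assumed here)."""
    rng = random.Random(seed)
    scale = 1.0 / math.sqrt(k)
    return [[rng.gauss(0.0, 1.0) * scale for _ in range(n)] for _ in range(k)]


def project(series, matrix):
    """Map a length-n series to k dimensions; reuse one matrix for every series compared."""
    return [sum(r * x for r, x in zip(row, series)) for row in matrix]


def euclidean(a, b):
    """Euclidean distance between two equal-length sequences."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw(a, b):
    """Classic quadratic-time dynamic time warping with |x - y| as the local cost."""
    inf = float("inf")
    prev = [0.0] + [inf] * len(b)  # row 0 of the cumulative cost table
    for x in a:
        curr = [inf]
        for j, y in enumerate(b, start=1):
            cost = abs(x - y)
            curr.append(cost + min(prev[j], prev[j - 1], curr[j - 1]))
        prev = curr
    return prev[-1]


if __name__ == "__main__":
    # Two synthetic series of length 64, reduced to 8 dimensions each.
    s = [math.sin(0.2 * t) for t in range(64)]
    q = [math.sin(0.2 * t + 0.3) for t in range(64)]
    matrix = random_projection_matrix(len(s), 8)
    print("Euclidean, full length:   ", euclidean(s, q))
    print("Euclidean, downsampled:   ", euclidean(downsample(s, 8), downsample(q, 8)))
    print("Euclidean, averaged:      ", euclidean(average(s, 8), average(q, 8)))
    print("Euclidean, rand. project.:", euclidean(project(s, matrix), project(q, matrix)))
    print("DTW, averaged:            ", dtw(average(s, 8), average(q, 8)))
```

In this sketch every series being compared must be projected with the same matrix, otherwise distances between the projected series are not comparable. Downsampling and averaging shrink all coordinates without rescaling, so distances measured on their outputs are on a different scale from full-length distances; only the relative ordering of candidates is meaningful when judging matching accuracy.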
