NSPRING: Normalization-supported SPRING for subsequence matching on time series streams

Mining sequences and patterns in time series data streams has attracted rapidly growing interest. Rapid progress in data collection and web technologies yields an ever-growing flow of data in various complex forms that must be analyzed on the fly. Traditional data mining methods, which typically require scanning the data multiple times, are infeasible for stream data applications. Newer techniques such as SPRING address these challenges by identifying sequences of patterns on time series streams with linear time and space complexity. Unfortunately, SPRING does not support normalization. Because many researchers accept that normalization is necessary, SPRING is not applicable to most data sets. In this paper, we propose an approach called NSPRING, which is based on SPRING. NSPRING retains the advantages of SPRING, e.g., low time and space complexity, while also supporting normalization. More interestingly, NSPRING retains mining accuracy similar to that of SPRING.
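To make the setting concrete, the following is a minimal Python sketch of streaming subsequence matching under dynamic time warping, in the spirit of SPRING: each incoming value updates a single column of the warping matrix, so memory is linear in the query length. It is an illustration under stated assumptions, not the NSPRING method and not a full SPRING implementation. The function name `best_subsequence_match`, the squared-difference cost, and the policy of reporting only the single best match are choices made for this example.

```python
def best_subsequence_match(stream, query):
    """Return (distance, start, end) of the best DTW match of `query`
    against any subsequence of `stream` (0-indexed, inclusive bounds).

    Illustrative sketch only: keeps one column of the warping matrix
    per tick, so memory is O(len(query)).
    """
    m = len(query)
    inf = float("inf")
    # prev_d[i]: best cost of aligning query[:i] with a subsequence ending
    # at the previous tick. Row 0 is a zero-cost row that lets a match
    # begin at any tick.
    prev_d = [0.0] + [inf] * m
    prev_s = [0] * (m + 1)  # start tick of the alignment behind prev_d[i]
    best = (inf, -1, -1)

    for t, x in enumerate(stream):
        prev_s[0] = t  # a match that begins now starts at tick t
        cur_d = [0.0] + [inf] * m
        cur_s = [t] + [0] * m
        for i in range(1, m + 1):
            cost = (x - query[i - 1]) ** 2  # assumed local cost
            # Three DTW predecessors, each carrying its start tick.
            d, s = min(
                (cur_d[i - 1], cur_s[i - 1]),    # advance query only
                (prev_d[i], prev_s[i]),          # advance stream only
                (prev_d[i - 1], prev_s[i - 1]),  # advance both
            )
            cur_d[i] = cost + d
            cur_s[i] = s
        # cur_d[m] is the cost of the best full-query match ending at t.
        if cur_d[m] < best[0]:
            best = (cur_d[m], cur_s[m], t)
        prev_d, prev_s = cur_d, cur_s
    return best


if __name__ == "__main__":
    query = [1, 2, 3]
    print(best_subsequence_match([5, 5, 1, 2, 3, 5, 5], query))
    # Same shape, shifted by a constant offset of 100.
    print(best_subsequence_match([105, 105, 101, 102, 103, 105, 105], query))
```

The second call uses a stream with the same shape as the first but a constant offset. Because the raw values are compared directly, the distance becomes large even though the pattern is unchanged, which is the kind of mismatch that normalization is meant to remove.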
