A New Pattern Representation Method for Time-Series Data

The rapid growth of the Internet of Things (IoT) and sensing technologies has led to growing interest in time-series data analysis. In many domains, detecting patterns in IoT data and interpreting them remain challenging. Several time-series analysis methods address the volume and velocity of IoT data streams; however, analysing the content of these streams and extracting insights from dynamic IoT data is still difficult. In this paper, we propose a pattern representation method that represents time-series frames as vectors by first applying Piecewise Aggregate Approximation (PAA) and then applying Lagrangian multipliers. The method represents continuous data as a series of patterns that various higher-level methods can use and process. We also introduce a new change point detection method that uses the constructed patterns in its analysis. We compare our representation method with the Blocks of Eigenvalues Algorithm (BEATS) and Symbolic Aggregate approXimation (SAX) on clustering tasks, and we separately evaluate the proposed change point detection method. The evaluations use UCR time-series datasets and a healthcare dataset. The results show that the proposed method significantly improves the analysis of time-series data.
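As an illustration of the first step only, the following is a minimal sketch of standard PAA applied to fixed-length frames. The function names `paa` and `frames`, the frame length, and the number of segments are assumptions made for this example and are not taken from the paper; the Lagrangian multiplier step is not specified in the abstract and is therefore not shown.

```python
import numpy as np


def paa(frame, n_segments):
    """Standard PAA: replace a frame by the mean of each of n_segments pieces."""
    x = np.asarray(frame, dtype=float)
    if x.ndim != 1 or not 1 <= n_segments <= x.size:
        raise ValueError("frame must be 1-D with at least n_segments samples")
    # Assumption: when the frame length is not divisible by n_segments,
    # np.array_split makes the leading pieces one sample longer.
    return np.array([piece.mean() for piece in np.array_split(x, n_segments)])


def frames(series, frame_len):
    """Cut a series into consecutive non-overlapping frames (hypothetical helper)."""
    x = np.asarray(series, dtype=float)
    n_frames = x.size // frame_len
    # Trailing samples that do not fill a whole frame are dropped.
    return x[: n_frames * frame_len].reshape(n_frames, frame_len)


# Toy signal: 64 samples split into 4 frames of 16, each reduced to 4 PAA values.
series = np.sin(np.linspace(0.0, 4.0 * np.pi, 64))
patterns = np.array([paa(f, 4) for f in frames(series, 16)])
print(patterns.shape)  # (4, 4)
```

Each row of `patterns` is one frame reduced to a short vector of segment means, which is the kind of intermediate representation a later step could transform further.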
