Continuous Trend-Based Classification of Streaming Time Series

Trend analysis of time series data is an important research direction. The problem is more challenging for streaming time series, because new values arrive continuously for each series, possibly at very high rates. Effective and efficient methods are therefore required to classify a streaming time series based on its trend. Since new values arrive continuously for each stream, classification is performed over a sliding window that covers the most recent values of each stream. Each streaming time series is transformed into a vector by means of a Piecewise Linear Approximation (PLA) technique. The PLA vector is a sequence of symbols denoting the trend of the series (either UP or DOWN), and it is constructed incrementally. Efficient in-memory methods are used to: 1) determine the class of each streaming time series and 2) determine the streaming time series that belong to a specific trend class. A performance evaluation on real-life datasets shows the efficiency of the proposed approach with respect to both classification time and storage requirements. The proposed method can be used to continuously classify a set of streaming time series according to their trends, to monitor the behavior of a set of streams, and to monitor the contents of a set of trend classes.
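
To make the idea concrete, the following is a minimal Python sketch of sliding-window trend classification. It is not the method of the paper: the abstract does not specify the PLA technique, so the sketch assumes the simplest possible segmentation (the sign of the difference between consecutive values, with consecutive steps in the same direction merged into one symbol), and it recomputes the trend vector over the window on every arrival instead of maintaining it incrementally. The names `TrendClassifier`, `update`, `class_of` and `members` are hypothetical and chosen only for illustration.

```python
from collections import defaultdict, deque

UP, DOWN = "U", "D"


def trend_vector(values):
    """Return a string of UP/DOWN symbols for a list of values.

    Assumption: a trend change is any change in the sign of the step
    between consecutive values; flat steps keep the current trend.
    """
    symbols = []
    for prev, cur in zip(values, values[1:]):
        if cur == prev:
            continue  # flat step: no new information about direction
        symbol = UP if cur > prev else DOWN
        if not symbols or symbols[-1] != symbol:
            symbols.append(symbol)  # direction changed: start a new segment
    return "".join(symbols)


class TrendClassifier:
    """Toy in-memory classifier over the last `window` values of each stream."""

    def __init__(self, window):
        self.window = window
        self.values = {}                 # stream id -> deque of recent values
        self.class_of = {}               # stream id -> current trend class
        self.members = defaultdict(set)  # trend class -> set of stream ids

    def update(self, stream_id, value):
        """Append a new value to a stream and refresh its trend class."""
        buf = self.values.setdefault(stream_id, deque(maxlen=self.window))
        buf.append(value)  # the oldest value is dropped once the window is full
        new_class = trend_vector(list(buf))
        old_class = self.class_of.get(stream_id)
        if new_class != old_class:
            if old_class is not None:
                self.members[old_class].discard(stream_id)
            self.members[new_class].add(stream_id)
            self.class_of[stream_id] = new_class
        return new_class
```

In this sketch, `class_of[stream_id]` answers the first kind of query (the class of a given stream) and `members[trend_class]` answers the second (the streams that currently belong to a given class). A stream with fewer than two distinct values in its window is assigned the empty class `""`.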
