Multivariate Time Series Classification by Combining Trend-Based and Value-Based Approximations
Abstract: Multivariate time series data often have a very high dimensionality. Classifying such high-dimensional data poses a challenge because a vast number of features can be extracted. Furthermore, the meaning of the normally intuitive term "similar to" needs to be precisely defined. Representing the time series data effectively is an essential task for decision-making activities such as prediction, clustering and classification. In this paper we propose a feature-based approach to classifying real-world multivariate time series generated by drilling rig sensors in the oil and gas industry. Our approach encompasses two main phases: representation and classification. For the representation phase, we propose a novel time series representation that combines trend-based and value-based approximations (abbreviated as TVA). It produces a compact representation consisting of symbolic strings that describe the trends and the values of each variable in the series. The TVA representation improves both the accuracy and the running time of the classification process by extracting a set of informative features suitable for common classifiers. For the classification phase, we propose a memory-based classifier which takes into account the antecedent results of the classification process. The inputs of the proposed classifier are the TVA features computed from the current segment, as well as the predicted class of the previous segment. Our experimental results on real-world multivariate time series show that our approach enables highly accurate and fast classification of multivariate time series.
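The abstract does not give the details of the TVA encoding or of the classifier, so the following Python sketch only illustrates the general idea under stated assumptions. It assumes each variable is already scaled to [0, 1], encodes the trend of a segment from its least-squares slope, encodes its value from its mean, and passes the previous predicted class to the classifier together with the current features. The symbol alphabets, the thresholds `flat_tol` and `breakpoints`, and all function names are hypothetical and are not taken from the paper.

```python
from statistics import mean


def slope(values):
    """Least-squares slope of the values against their index 0..n-1."""
    n = len(values)
    if n < 2:
        return 0.0
    x_bar = (n - 1) / 2
    y_bar = mean(values)
    num = sum((i - x_bar) * (v - y_bar) for i, v in enumerate(values))
    den = sum((i - x_bar) ** 2 for i in range(n))
    return num / den


def trend_symbol(values, flat_tol=0.05):
    """Assumed trend alphabet: U (up), D (down), S (steady)."""
    s = slope(values)
    if s > flat_tol:
        return "U"
    if s < -flat_tol:
        return "D"
    return "S"


def value_symbol(values, breakpoints=(0.33, 0.66)):
    """Assumed value alphabet: L, M, H, chosen from the segment mean."""
    m = mean(values)
    level = sum(m > b for b in breakpoints)
    return "LMH"[level]


def tva_features(segment):
    """Map each variable of a segment to a trend symbol plus a value symbol.

    segment: dict of variable name -> list of values scaled to [0, 1].
    """
    return {
        name: trend_symbol(vals) + value_symbol(vals)
        for name, vals in sorted(segment.items())
    }


def classify_sequence(segments, classify, initial_class="unknown"):
    """Classify segments in order, feeding back the previous prediction.

    classify: any callable (features, previous_class) -> class label.
    """
    previous = initial_class
    predictions = []
    for segment in segments:
        previous = classify(tva_features(segment), previous)
        predictions.append(previous)
    return predictions


def toy_classifier(features, previous_class):
    """Hypothetical rule: a rising 'a' means 'up', otherwise keep the last class."""
    if features["a"].startswith("U"):
        return "up"
    return previous_class


segments = [
    {"a": [0.1, 0.2, 0.3, 0.4], "b": [0.9, 0.9, 0.9, 0.9]},
    {"a": [0.4, 0.4, 0.4, 0.4], "b": [0.9, 0.9, 0.9, 0.9]},
]
predictions = classify_sequence(segments, toy_classifier)
```

In this sketch `toy_classifier` stands in for whatever common classifier is trained on the features; the only point it illustrates is that the previous prediction is an input alongside the symbolic features of the current segment.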