New Time Series Data Representation ESAX for Financial Applications
Extended SAX: Extension of Symbolic Aggregate Approximation for Financial Time Series Data Representation
Efficient and accurate similarity search over large time series data sets is an important but non-trivial problem. Many dimensionality reduction techniques have been proposed to represent time series data effectively for such similarity search, including Singular Value Decomposition (SVD), the Discrete Fourier Transform (DFT), Adaptive Piecewise Constant Approximation (APCA), and the recently proposed Symbolic Aggregate Approximation (SAX). In this work we propose an extension of SAX, called Extended SAX, to enable efficient and accurate discovery of the important patterns that financial applications require. The original SAX approach achieves very good dimensionality reduction and allows distance measures to be defined on the symbolic representation. However, SAX is based on the Piecewise Aggregate Approximation (PAA) representation, which reduces dimensionality by replacing each equal-sized frame with its mean value. This mean-based representation is likely to miss important patterns in some time series, such as financial time series data. Extended SAX, proposed in this paper, approximates each equal-sized frame with two additional points, the maximum and the minimum, besides the mean value. We show that Extended SAX can improve the precision of the representation without losing the symbolic nature of the original SAX representation. We empirically compare Extended SAX with the original SAX approach and demonstrate the improvement in representation quality.
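The difference between the mean-only representation and the min/mean/max representation can be illustrated with a minimal Python sketch. It is not the authors' implementation. The function names (`sax`, `esax`, and the helpers) are hypothetical. The z-normalization and equiprobable Gaussian breakpoints follow the usual SAX convention. The order in which the three symbols of a frame are emitted (min, mean, max) is an assumption made here for simplicity and may differ from the ordering used in the paper.

```python
from bisect import bisect_right
from statistics import NormalDist, fmean, pstdev


def znormalize(series):
    """Rescale a series to zero mean and unit (population) standard deviation."""
    mu, sigma = fmean(series), pstdev(series)
    if sigma == 0:
        return [0.0] * len(series)
    return [(x - mu) / sigma for x in series]


def breakpoints(alphabet_size):
    """Cut points that split the standard normal into equiprobable regions."""
    nd = NormalDist()
    return [nd.inv_cdf(i / alphabet_size) for i in range(1, alphabet_size)]


def symbol(value, cuts):
    """Map a real value to a letter: 'a' for the lowest region, then 'b', ..."""
    return chr(ord("a") + bisect_right(cuts, value))


def frames(series, n_frames):
    """Split a series into n_frames equal-sized frames."""
    if n_frames <= 0 or len(series) % n_frames != 0:
        raise ValueError("series length must be a positive multiple of n_frames")
    size = len(series) // n_frames
    return [series[i * size:(i + 1) * size] for i in range(n_frames)]


def sax(series, n_frames, alphabet_size):
    """SAX-style word: one symbol per frame, taken from the frame mean (PAA)."""
    cuts = breakpoints(alphabet_size)
    z = znormalize(series)
    return "".join(symbol(fmean(f), cuts) for f in frames(z, n_frames))


def esax(series, n_frames, alphabet_size):
    """Extended-SAX-style word: three symbols per frame (min, mean, max).

    Assumption: symbols are emitted in the fixed order min, mean, max.
    """
    cuts = breakpoints(alphabet_size)
    z = znormalize(series)
    word = []
    for f in frames(z, n_frames):
        word.append(symbol(min(f), cuts))
        word.append(symbol(fmean(f), cuts))
        word.append(symbol(max(f), cuts))
    return "".join(word)


# A flat frame followed by a volatile frame with the same mean.
series = [0, 0, 0, 0, -3, 3, -3, 3]
print(sax(series, n_frames=2, alphabet_size=3))   # bb
print(esax(series, n_frames=2, alphabet_size=3))  # bbbabc
```

In this toy series both frames have the same mean, so the mean-only word cannot tell the flat frame from the volatile one, while the word that also encodes the minimum and maximum can. The output remains a string over the same alphabet, which is the sense in which the symbolic nature of SAX is preserved.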