Sparse Modeling of Volatile Financial Time Series Via Low-Dimensional Patterns Over Learned Dictionaries

Financial time series typically exhibit non-stationarity and time-varying volatility. Extracting and analyzing complicated patterns, such as trends and transient changes, is at the core of modern financial data analytics. Moreover, efficient and timely analysis is often hindered by the large volumes of raw data that are now generated and stored. In this paper, we exploit the ability of learned dictionaries to adapt accurately to the underlying micro-local structures of time series in order to extract sparse patterns that compactly capture the meaningful information in volatile financial data. Specifically, the proposed method relies on sparse representations of the original time series in terms of dictionary atoms, which are learned and updated directly from the available data in a rolling-window fashion. In contrast to previous methods, the extracted sparse patterns enable both compact storage and highly accurate reconstruction of the original data. Equally important, financial analytics, such as volatility clustering, can be performed directly on the sparse patterns, reducing the overall computational cost without degrading accuracy. Experimental evaluation on 12 market indexes shows that our approach outperforms a modified symbolic representation and a well-established wavelet transform-based technique in terms of information compactness, reconstruction accuracy, and volatility clustering efficiency.
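To make the idea of a sparse representation over dictionary atoms concrete, the following is a minimal sketch, not the method of the paper. It assumes a fixed, hand-made dictionary (the learning and updating of atoms is omitted), uses plain greedy matching pursuit as the sparse coder, and cuts the series into non-overlapping windows. The names `rolling_windows`, `matching_pursuit`, `reconstruct`, and `ATOMS`, as well as the toy price series, are hypothetical and chosen only for illustration.

```python
import math


def rolling_windows(series, width, step):
    """Split a series into windows of equal width, advancing by `step`."""
    return [series[i:i + width] for i in range(0, len(series) - width + 1, step)]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(atom):
    """Scale an atom to unit Euclidean norm."""
    norm = math.sqrt(dot(atom, atom))
    return [a / norm for a in atom]


def matching_pursuit(window, atoms, n_nonzero):
    """Greedy sparse code of `window` over unit-norm `atoms`.

    Returns ({atom_index: coefficient}, residual). At most `n_nonzero`
    greedy steps are taken, so at most that many atoms are used.
    """
    residual = list(window)
    code = {}
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        scores = [dot(residual, atom) for atom in atoms]
        k = max(range(len(atoms)), key=lambda j: abs(scores[j]))
        if abs(scores[k]) < 1e-12:
            break  # nothing left to explain
        code[k] = code.get(k, 0.0) + scores[k]
        residual = [r - scores[k] * a for r, a in zip(residual, atoms[k])]
    return code, residual


def reconstruct(code, atoms, width):
    """Rebuild a window from its sparse code."""
    out = [0.0] * width
    for k, c in code.items():
        out = [o + c * a for o, a in zip(out, atoms[k])]
    return out


# Hypothetical toy dictionary for windows of length 4 (assumed, not learned).
ATOMS = [normalize(a) for a in (
    [1.0, 1.0, 1.0, 1.0],    # constant level
    [-3.0, -1.0, 1.0, 3.0],  # linear trend
    [1.0, -1.0, -1.0, 1.0],  # curvature
    [0.0, 0.0, 0.0, 1.0],    # transient jump at the end of the window
)]

if __name__ == "__main__":
    # Made-up series: a steady trend followed by a jump in the last sample.
    series = [100.0, 101.0, 102.0, 103.0, 104.0, 105.0, 106.0, 112.0]
    for window in rolling_windows(series, width=4, step=4):
        code, residual = matching_pursuit(window, ATOMS, n_nonzero=2)
        error = math.sqrt(dot(residual, residual))
        print(window, "->", code, "residual norm:", round(error, 4))
```

Each window is stored as a handful of (atom index, coefficient) pairs instead of its raw samples, which is the source of the compactness. In a dictionary-learning setting the atoms themselves would be re-estimated from recent windows rather than fixed in advance as they are here.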
