Time Series Featurization via Topological Data Analysis: an Application to Cryptocurrency Trend Forecasting

We propose a novel methodology for feature extraction from time series data based on topological data analysis. The proposed procedure applies principal component analysis to reduce the dimension of the point cloud obtained from the Takens' embedding of the observed time series, and then evaluates the persistence landscapes and silhouettes of the corresponding Rips complex. We define a new notion of Rips distance function that is especially suited to persistent homology built on Rips complexes, and prove stability theorems for it. We then use these results to establish stability properties of the topological features extracted by our procedure with respect to additive noise and sampling. We further apply our method to trend forecasting for cryptocurrency prices, where we achieve significantly lower error rates than more standard, non-TDA-based methodologies in complex pattern classification tasks. We expect our method to provide new insight into feature engineering for granular, noisy time series data.
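The pipeline described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes only NumPy, and the function names (`takens_embedding`, `pca_project`, `landscape`, `silhouette`) and parameter values are hypothetical. The Rips persistence computation itself is omitted: the sketch assumes a persistence diagram, given as (birth, death) pairs, has already been obtained from the projected point cloud with a TDA library, and it uses the standard definitions of the persistence landscape and the power-weighted silhouette.

```python
import numpy as np

def takens_embedding(x, dim, delay):
    """Delay vectors (x[t], x[t+delay], ..., x[t+(dim-1)*delay]) as rows."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * delay
    if n <= 0:
        raise ValueError("series too short for this dim/delay")
    return np.stack([x[i * delay : i * delay + n] for i in range(dim)], axis=1)

def pca_project(points, n_components):
    """Project a centered point cloud onto its leading principal axes."""
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def _tents(diagram, grid):
    """Tent function max(0, min(t - b, d - t)) per diagram point, per grid value."""
    diagram = np.asarray(diagram, dtype=float)
    grid = np.asarray(grid, dtype=float)
    b = diagram[:, 0][:, None]
    d = diagram[:, 1][:, None]
    return np.maximum(0.0, np.minimum(grid[None, :] - b, d - grid[None, :]))

def landscape(diagram, k, grid):
    """k-th persistence landscape: k-th largest tent value at each grid point."""
    tents = _tents(diagram, grid)
    if k > tents.shape[0]:
        return np.zeros(tents.shape[1])
    return np.sort(tents, axis=0)[::-1][k - 1]

def silhouette(diagram, grid, p=1.0):
    """Power-weighted silhouette: tents averaged with weights |d - b|**p."""
    diagram = np.asarray(diagram, dtype=float)
    weights = np.abs(diagram[:, 1] - diagram[:, 0]) ** p
    return weights @ _tents(diagram, grid) / weights.sum()

# Illustrative usage on a synthetic periodic signal (not cryptocurrency data).
series = np.sin(np.linspace(0.0, 8.0 * np.pi, 200))
cloud = takens_embedding(series, dim=5, delay=4)
projected = pca_project(cloud, n_components=2)

# Hypothetical diagram standing in for the output of a Rips persistence routine.
example_diagram = [(0.0, 4.0), (1.0, 3.0)]
grid = np.linspace(0.0, 4.0, 9)
features = np.concatenate([
    landscape(example_diagram, 1, grid),
    landscape(example_diagram, 2, grid),
    silhouette(example_diagram, grid, p=1.0),
])
```

The resulting `features` vector is one plausible way to turn the landscape and silhouette functions into fixed-length inputs for a classifier; the grid resolution, the number of landscape layers, and the silhouette power `p` are all choices made here for illustration.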
