Time Series Featurization via Topological Data Analysis

We develop an algorithm for extracting features from time series data using tools from topological data analysis (TDA). The algorithm offers a simple, efficient way to capture topological features of the attractor of the dynamical system underlying an observed time series. It proceeds in three steps: a noisy, discrete time series sample is mapped to a point cloud by a time-delay embedding; the point cloud is de-noised by projection onto its principal components; and features are computed from the persistence landscapes and silhouettes of the Rips complex built on the de-noised point cloud. We analyze the stability of the approach and show that the resulting TDA-based features are robust to sampling noise. Experiments on synthetic and real-world data demonstrate its effectiveness. We expect the method to provide new insights into feature extraction from granular, noisy time series data.
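The pipeline described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names (`delay_embed`, `pca_project`, `landscape`, `silhouette`), the parameter choices, and the toy persistence diagram are all assumptions made for illustration. The Rips persistence computation is deliberately left out; in practice the diagram would come from a TDA library, and here a hand-written diagram stands in for it.

```python
import numpy as np

def delay_embed(x, dim, tau):
    """Time-delay embedding: row i is (x[i], x[i+tau], ..., x[i+(dim-1)*tau])."""
    x = np.asarray(x, dtype=float)
    n = len(x) - (dim - 1) * tau
    if n <= 0:
        raise ValueError("series too short for this dim/tau")
    return np.stack([x[j * tau: j * tau + n] for j in range(dim)], axis=1)

def pca_project(points, n_components):
    """De-noise by projecting centered points onto the top principal directions."""
    centered = points - points.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:n_components].T

def _tents(diagram, grid):
    """Tent function max(0, min(t - b, d - t)) for each (b, d) pair, on the grid."""
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    grid = np.asarray(grid, dtype=float)
    b, d = diagram[:, :1], diagram[:, 1:]
    return diagram, np.maximum(0.0, np.minimum(grid[None, :] - b, d - grid[None, :]))

def landscape(diagram, k, grid):
    """k-th persistence landscape (k = 1 is the outermost), evaluated on the grid."""
    _, tents = _tents(diagram, grid)
    if k > tents.shape[0]:
        return np.zeros(tents.shape[1])
    return -np.sort(-tents, axis=0)[k - 1]  # k-th largest tent value at each t

def silhouette(diagram, grid, p=1.0):
    """Power-weighted silhouette: tents averaged with weights (d - b)**p."""
    diagram, tents = _tents(diagram, grid)
    weights = (diagram[:, 1] - diagram[:, 0]) ** p
    if weights.sum() == 0:
        return np.zeros(tents.shape[1])
    return (weights[:, None] * tents).sum(axis=0) / weights.sum()

# Steps 1-2: embed a noisy sine wave, then de-noise with PCA (parameters are arbitrary).
rng = np.random.default_rng(0)
t = np.linspace(0, 8 * np.pi, 400)
series = np.sin(t) + 0.1 * rng.standard_normal(t.size)
cloud = pca_project(delay_embed(series, dim=5, tau=8), n_components=2)

# Step 3: a toy diagram stands in for the Rips persistence diagram of `cloud`.
toy_diagram = [(0.0, 2.0), (1.0, 3.0)]
grid = np.linspace(0.0, 3.0, 7)
features = np.concatenate([landscape(toy_diagram, 1, grid),
                           landscape(toy_diagram, 2, grid),
                           silhouette(toy_diagram, grid)])
```

Evaluating the landscapes and the silhouette on a fixed grid yields a fixed-length feature vector, which is what makes these summaries convenient inputs for standard learning methods.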
