Machine Learning of Time Series Using Time-Delay Embedding and Precision Annealing

Tasking machine learning with predicting segments of a time series requires estimating the parameters of an ML model from input/output pairs drawn from that time series. We borrow two techniques from statistical data assimilation to accomplish this: time-delay embedding, to prepare the input data, and precision annealing, as a training method. The precision annealing approach identifies the global minimum of the action (-log[P]); in this way we are able to determine the number of training pairs required to produce good generalizations (predictions) of the time series. Starting from a scalar time series s(t_n), with t_n = t_0 + nΔt, we use methods of nonlinear time series analysis to construct a D_E-dimensional (D_E > 1) time-delay embedding space in which the trajectory has no false neighbors, unlike the observed scalar time series s(t_n). In that D_E-dimensional space, we explore feedforward multilayer perceptrons as network models operating on D_E-dimensional inputs and producing D_E-dimensional outputs.
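The time-delay embedding step can be sketched as follows (a minimal illustration, not the authors' code; the function name is ours, and in practice the delay τ is typically chosen via average mutual information and D_E via a false-nearest-neighbors test, standard choices in nonlinear time series analysis):

```python
import numpy as np

def time_delay_embed(s, d_e, tau):
    """Map a scalar series s(t_n) into D_E-dimensional delay vectors
    S(n) = [s(n), s(n + tau), ..., s(n + (d_e - 1) * tau)]."""
    n_vectors = len(s) - (d_e - 1) * tau  # number of complete delay vectors
    return np.column_stack([s[k * tau : k * tau + n_vectors] for k in range(d_e)])

# Example: embed a short scalar series with D_E = 3, tau = 2.
s = np.arange(10.0)
vectors = time_delay_embed(s, d_e=3, tau=2)
# vectors has shape (6, 3); its first row is [0., 2., 4.]
```

Each row of the output is one D_E-dimensional input/output point on which the multilayer perceptron can then be trained.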
