Deep Recurrent Neural Networks for Time Series Prediction

The ability of deep networks to extract high-level features and of recurrent networks to perform time-series inference has been studied. Since a network with one hidden layer can, under weak constraints, approximate any function, the benefit of multiple layers is either to enlarge the space of dynamical systems that can be approximated or, for a given space, to reduce the number of units required to achieve a given error. Traditionally, shallow networks with manually engineered features are used, the extent of backpropagation is limited to one step, and a large number of hidden units is chosen in an attempt to satisfy the Markov condition. In the case of Markov models, it has been shown that many systems must be modeled as higher order. In the present work, we propose deep recurrent networks with a longer backpropagation-through-time extent as a solution for modeling high-order systems and for predicting ahead. As an application we study an electro-stimulator for epileptic seizure suppression. Extracting complex, manually engineered features and predicting from them has not allowed small, low-power implementations because, to avoid the possibility of further surgery, the device must include extraction of any features that may ever be required. In our solution, a recurrent neural network performs both feature extraction and prediction. We prove analytically that adding hidden layers or increasing the backpropagation extent increases the rate of decrease of the approximation error. A dynamic-programming (DP) training procedure employing matrix operations is derived; DP and the use of matrix operations make the procedure efficient, particularly on data-parallel hardware. Simulation studies illustrate the geometry of the parameter space and show that the network learns the temporal structure of the data, that its parameters converge while the model output displays the same dynamic behavior as the system, and that the method achieves an Average Detection Rate greater than 0.99 on all real seizure data tried.
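The training procedure described above is not reproduced here, but the core idea of backpropagation through time with matrix operations can be illustrated with a minimal sketch. The following is not the paper's derivation: it is a standard truncated-BPTT gradient computation for a single-layer tanh RNN with a scalar one-step-ahead prediction, with all dimensions, names, and the toy task chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
H, D, T = 8, 3, 20                 # hidden units, input dim, BPTT extent

Wx = rng.normal(0, 0.1, (H, D))    # input-to-hidden weights
Wh = rng.normal(0, 0.1, (H, H))    # recurrent hidden-to-hidden weights
wo = rng.normal(0, 0.1, H)         # hidden-to-output weights (scalar output)

def forward(xs):
    """Unroll the RNN over the sequence, caching states for BPTT."""
    hs = [np.zeros(H)]
    for x in xs:
        hs.append(np.tanh(Wx @ x + Wh @ hs[-1]))
    y = wo @ hs[-1]                # predict from the final hidden state
    return y, hs

def bptt(xs, target):
    """Gradients of squared prediction error via truncated BPTT."""
    y, hs = forward(xs)
    err = y - target
    dWx, dWh = np.zeros_like(Wx), np.zeros_like(Wh)
    dwo = err * hs[-1]
    dh = err * wo                  # gradient entering the last hidden state
    for t in range(T, 0, -1):      # propagate back through time
        dz = dh * (1.0 - hs[t] ** 2)       # through the tanh nonlinearity
        dWx += np.outer(dz, xs[t - 1])     # accumulate via matrix (outer)
        dWh += np.outer(dz, hs[t - 1])     #   products at every time step
        dh = Wh.T @ dz
    return err ** 2 / 2.0, dWx, dWh, dwo

# Toy one-step-ahead prediction task fitted by plain gradient descent.
xs = rng.normal(size=(T, D))
target = 0.5
losses = []
for _ in range(500):
    loss, dWx, dWh, dwo = bptt(xs, target)
    losses.append(float(loss))
    Wx -= 0.1 * dWx
    Wh -= 0.1 * dWh
    wo -= 0.1 * dwo
```

Because the per-step work reduces to matrix-vector products and outer products, the same structure maps naturally onto data-parallel hardware, which is the efficiency point the abstract makes about the derived DP procedure.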
