State initialization for recurrent neural network modeling of time-series data

To use a recurrent neural network (RNN) for time-series modeling, the network must be properly initialized, that is, the hidden neuron outputs must be set appropriately at the initial time step. An RNN is normally initialized with zero state values or at a steady state. In the context of dynamic system identification, such initializations imply that the system to be modeled is at steady state, so capturing the transient behavior of the system is difficult unless the network states are initialized appropriately. If the initial network states cannot be computed from the training data, a method is needed to infer them during both the training and validation phases. In this paper, we use a feedforward neural network to initialize a structurally deep recurrent neural network for learning and multi-step prediction of the altitude of a real quadrotor vehicle. To the best of our knowledge, this is the first time a neural network has outperformed a physics-based model for multi-step time-series prediction from recorded quadrotor flight data.
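
Below is a minimal sketch of this initialization scheme in PyTorch; the class name, layer sizes, and washout-window length are illustrative assumptions rather than the architecture used in the paper. A small feedforward network maps the first few observed samples to the RNN's initial hidden state, after which the recurrent model rolls out multi-step predictions.

import torch
import torch.nn as nn

class StateInitRNN(nn.Module):
    # Hypothetical sketch: a feedforward "initializer" consumes a short
    # washout window of observations and produces the hidden state h_0,
    # so the RNN can capture transients instead of starting from zeros.
    def __init__(self, input_dim, hidden_dim, output_dim, init_window):
        super().__init__()
        self.init_window = init_window
        self.initializer = nn.Sequential(
            nn.Linear(init_window * input_dim, hidden_dim),
            nn.Tanh(),
            nn.Linear(hidden_dim, hidden_dim),
        )
        self.rnn = nn.RNN(input_dim, hidden_dim, batch_first=True)
        self.readout = nn.Linear(hidden_dim, output_dim)

    def forward(self, x):
        # x: (batch, time, input_dim). The first init_window steps feed
        # the initializer; the remaining steps drive the recurrent rollout.
        head, tail = x[:, :self.init_window], x[:, self.init_window:]
        h0 = self.initializer(head.flatten(start_dim=1)).unsqueeze(0)
        out, _ = self.rnn(tail, h0)
        return self.readout(out)  # one prediction per remaining time step

# Example usage with made-up dimensions (e.g., altitude as the sole output):
model = StateInitRNN(input_dim=4, hidden_dim=32, output_dim=1, init_window=10)
x = torch.randn(8, 60, 4)   # 8 sequences of 60 time steps
y_hat = model(x)            # shape (8, 50, 1): multi-step predictions

Because the initializer is trained jointly with the recurrent weights, the same state-inference mechanism is available in both the training and validation phases, which is the requirement stated above.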
