Stochastic Fast Gradient for Tracking

In many recent applications, first-order optimization methods are applied in a non-stationary setting in which the minimum point drifts over time: the so-called parameter tracking, or non-stationary optimization (NSO), problem. In this paper, we propose a new method for NSO derived from Nesterov's Fast Gradient and derive theoretical bounds on its expected estimation error. Simulations illustrate these results: under deterministic drift, the proposed method estimates the minimum points more accurately than the unmodified Fast Gradient or Stochastic Gradient, while under a purely random-walk drift all methods behave similarly. The proposed method can also be used to train convolutional neural networks for super-resolution of digital surface models.
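
To make the tracking setting concrete, here is a minimal sketch of the kind of experiment the abstract describes: a stochastic Nesterov fast-gradient update tracking the drifting minimum of a simple quadratic, compared against plain stochastic gradient. This is an illustrative assumption, not the paper's exact algorithm; the loss, the drift model, and the parameter values (alpha, beta, noise_std) are all hypothetical choices made for the example.

```python
# Sketch (not the paper's exact method): track the drifting minimum of
# f_t(x) = 0.5 * ||x - theta_t||^2 using a constant-step stochastic
# Nesterov fast-gradient update, and compare against plain SGD.
import numpy as np

rng = np.random.default_rng(0)
dim, steps = 2, 500
drift = np.array([0.01, -0.005])   # deterministic drift of the minimum (assumed)
noise_std = 0.1                    # gradient noise level (assumed)
alpha, beta = 0.1, 0.9             # step size and momentum (assumed)

theta = np.zeros(dim)              # true (moving) minimum point
x_fg = np.zeros(dim)               # fast-gradient iterate
y = np.zeros(dim)                  # Nesterov lookahead point
x_sgd = np.zeros(dim)              # plain SGD iterate

err_fg, err_sgd = [], []
for t in range(steps):
    theta = theta + drift          # the minimum point drifts each step

    # Noisy gradient of the quadratic loss, evaluated at the lookahead point.
    g = (y - theta) + noise_std * rng.standard_normal(dim)
    x_next = y - alpha * g                      # gradient step
    y = x_next + beta * (x_next - x_fg)         # Nesterov extrapolation
    x_fg = x_next

    # Plain SGD for comparison, same noise level.
    g_sgd = (x_sgd - theta) + noise_std * rng.standard_normal(dim)
    x_sgd = x_sgd - alpha * g_sgd

    err_fg.append(np.linalg.norm(x_fg - theta))
    err_sgd.append(np.linalg.norm(x_sgd - theta))

# Average tracking error after an initial transient.
print(f"mean tracking error, fast gradient: {np.mean(err_fg[100:]):.4f}")
print(f"mean tracking error, SGD:           {np.mean(err_sgd[100:]):.4f}")
```

Under the deterministic drift assumed above, the momentum term lets the iterate build up velocity in the drift direction, which is the intuition behind the accuracy gain reported in the abstract; replacing `drift` with an i.i.d. random-walk increment removes that advantage.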
