Fast Online Policy Gradient Learning with SMD Gain Vector Adaptation
[1] Robert A. Jacobs, et al. Increased rates of convergence through learning rate adaptation, 1987, Neural Networks.
[2] Luís B. Almeida, et al. Acceleration Techniques for the Backpropagation Algorithm, 1990, EURASIP Workshop.
[3] Tom Tollenaere, et al. SuperSAB: Fast adaptive back propagation with good scaling properties, 1990, Neural Networks.
[4] Martin A. Riedmiller, et al. A direct adaptive method for faster backpropagation learning: the RPROP algorithm, 1993, IEEE International Conference on Neural Networks.
[5] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian, 1994, Neural Computation.
[6] Manfred K. Warmuth, et al. Additive versus exponentiated gradient updates for linear prediction, 1995, STOC '95.
[7] Shun-ichi Amari, et al. Natural Gradient Works Efficiently in Learning, 1998, Neural Computation.
[8] Thibault Langlois, et al. Parameter adaptation in stochastic optimization, 1999.
[9] Nicol N. Schraudolph, et al. Local Gain Adaptation in Stochastic Gradient Descent, 1999.
[10] Andreas Griewank, et al. Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation, Second Edition, 2000, Frontiers in Applied Mathematics.
[11] Peter L. Bartlett, et al. Infinite-Horizon Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[12] Peter L. Bartlett, et al. Experiments with Infinite-Horizon, Policy-Gradient Estimation, 2001, J. Artif. Intell. Res.
[13] Sham M. Kakade, et al. A Natural Policy Gradient, 2001, NIPS.
[14] Nicol N. Schraudolph, et al. Fast Curvature Matrix-Vector Products for Second-Order Gradient Descent, 2002, Neural Computation.
[15] Nicol N. Schraudolph, et al. Combining Conjugate Direction Methods with Stochastic Approximation of Gradients, 2003, AISTATS.
[16] R. Sutton. Gain Adaptation Beats Least Squares, 1992, Proceedings of the Seventh Yale Workshop on Adaptive and Learning Systems.