Empirical Comparison of Gradient Descent and Exponentiated Gradient Descent in Supervised and Reinforcement Learning
This report describes a series of results using the exponentiated gradient descent (EG) method recently proposed by Kivinen and Warmuth. Prior work is extended by comparing the speed of learning on a nonstationary problem and on an extension to backpropagation networks. Most significantly, we present an extension of the EG method to temporal-difference and reinforcement learning. This extension is compared to conventional reinforcement learning methods on two test problems using CMAC function approximators and replace traces. On the larger of the two problems, the average loss was approximately 25% smaller for the EG method. The relative computational complexity and parameter sensitivity of the two methods are also discussed.
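As an informal illustration of the two update rules being compared (not the report's experimental setup), the sketch below contrasts a plain gradient-descent step with a Kivinen-Warmuth-style exponentiated-gradient step for a linear predictor under squared loss. The learning rate, the fixed total weight mass U, and the toy data are illustrative assumptions, not values from the report.

```python
import numpy as np

def gd_update(w, x, y, eta=0.1):
    """One gradient-descent (GD) step for a linear predictor with squared loss."""
    err = np.dot(w, x) - y
    return w - eta * err * x

def eg_update(w, x, y, eta=0.1, U=1.0):
    """One exponentiated-gradient (EG) step: weights are updated multiplicatively,
    stay positive, and are renormalized to a fixed total mass U."""
    err = np.dot(w, x) - y
    w_new = w * np.exp(-eta * err * x)
    return U * w_new / np.sum(w_new)

# Toy usage: both learners track the same positive, normalized target weight vector.
rng = np.random.default_rng(0)
d = 8
target = rng.random(d)
target /= target.sum()
w_gd = np.full(d, 1.0 / d)
w_eg = np.full(d, 1.0 / d)
for _ in range(1000):
    x = rng.random(d)
    y = np.dot(target, x)
    w_gd = gd_update(w_gd, x, y)
    w_eg = eg_update(w_eg, x, y)
print("GD distance to target:", np.linalg.norm(w_gd - target))
print("EG distance to target:", np.linalg.norm(w_eg - target))
```

The key design difference is that GD takes additive steps in the weights, whereas EG takes multiplicative steps, keeping the weights positive and normalized; this is the property that motivates the comparisons reported above.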
[1] A. Moore. Variable Resolution Dynamic Programming. ML, 1991.
[2] G. A. Rummery and M. Niranjan. On-line Q-learning Using Connectionist Systems, 1994.
[3] J. P. Callan et al. Training Algorithms for Linear Text Classifiers. SIGIR '96, 1996.
[4] S. P. Singh and R. S. Sutton. Reinforcement Learning with Replacing Eligibility Traces. Machine Learning, 1996.
[5] J. Kivinen and M. K. Warmuth. Exponentiated Gradient Versus Gradient Descent for Linear Predictors. Information and Computation, 1997.