An Actor-Critic Algorithm Using a Binary Tree Action Selector
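Only the bibliography of this entry is reproduced below. As a rough illustration of the titular technique, here is a minimal sketch of a binary tree action selector for a discrete-action actor: each internal node makes a stochastic left/right decision via a sigmoid of the state, the leaf reached is the action, and the path log-probability supports a policy-gradient update driven by a critic's TD error (cf. [2], [9]). Everything here, including the class name BinaryTreeActionSelector and the per-node sigmoid parameterization, is an assumption for illustration, not the paper's actual formulation.

```python
import numpy as np

class BinaryTreeActionSelector:
    """Hypothetical binary-tree actor: descends a tree of stochastic
    left/right decisions; the leaf reached is the discrete action."""

    def __init__(self, state_dim, depth, rng=None):
        self.depth = depth                      # tree depth -> 2**depth actions
        self.rng = rng if rng is not None else np.random.default_rng(0)
        # One weight vector per internal node (2**depth - 1 of them).
        self.weights = 0.01 * self.rng.standard_normal((2**depth - 1, state_dim))

    def sample(self, state):
        """One stochastic descent: returns (action, log_prob, grads), where
        grads holds d(log pi)/d(weights) for a REINFORCE-style update."""
        node, action, log_prob = 0, 0, 0.0
        grads = np.zeros_like(self.weights)
        for _ in range(self.depth):
            p_right = 1.0 / (1.0 + np.exp(-self.weights[node] @ state))
            go_right = self.rng.random() < p_right
            if go_right:
                log_prob += np.log(p_right + 1e-12)
                grads[node] = (1.0 - p_right) * state   # d log sigma / d w
            else:
                log_prob += np.log(1.0 - p_right + 1e-12)
                grads[node] = -p_right * state          # d log(1 - sigma) / d w
            action = 2 * action + int(go_right)         # build leaf index bit by bit
            node = 2 * node + 1 + int(go_right)         # heap-style child index
        return action, log_prob, grads

# Usage sketch: actor update driven by a critic's TD error (hypothetical values).
selector = BinaryTreeActionSelector(state_dim=4, depth=3)  # 2**3 = 8 actions
state = np.ones(4)
action, log_prob, grads = selector.sample(state)
td_error = 0.5                                  # would come from the critic
selector.weights += 0.01 * td_error * grads     # actor-critic policy update
```

One plausible motivation for such a selector is cost: sampling an action and computing its log-probability take O(depth) = O(log |A|) node evaluations, rather than evaluating a softmax over all |A| actions.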
[1] 計測自動制御学会 (The Society of Instrument and Control Engineers). 計測と制御 = Journal of the Society of Instrument and Control Engineers, 1962.
[2] Richard S. Sutton et al. Neuronlike adaptive elements that can solve difficult learning control problems, 1983, IEEE Transactions on Systems, Man, and Cybernetics.
[3] R. J. Williams. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 1992, Machine Learning.
[4] Richard S. Sutton et al. Generalization in Reinforcement Learning: Successful Examples Using Sparse Coarse Coding, 1995, NIPS.
[5] John N. Tsitsiklis et al. Analysis of Temporal-Difference Learning with Function Approximation, 1996, NIPS.
[6] Shigenobu Kobayashi et al. An Analysis of Actor/Critic Algorithms Using Eligibility Traces: Reinforcement Learning with Imperfect Value Function, 1998, ICML.
[7] N. Adachi et al. Reinforcement Learning Using Regularization Theory to Treat the Continuous States and Actions, 1998.
[8] Leslie Pack Kaelbling et al. Learning Policies with External Memory, 1999, ICML.
[9] Yishay Mansour et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation, 1999, NIPS.
[10] Osamu Katai et al. Fuzzy Interpolation-Based Q-Learning with Continuous Inputs and Outputs, 1999.