Natural gradient actor-critic algorithms using random rectangular coarse coding

Natural gradient actor-critic algorithms achieve markedly better learning performance than conventional actor-critic algorithms, especially in high-dimensional spaces. However, the representation of stochastic policies and value functions remains an open issue, because actor-critic approaches require it to be designed carefully. The author has proposed random rectangular coarse coding, which is very simple and well suited to approximating Q-values in high-dimensional state-action spaces. This paper gives a quantitative analysis of the random coarse coding in comparison with regular-grid approaches, and presents a new approach that combines the natural gradient actor-critic with the random rectangular coarse coding.
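
The following is a minimal illustrative sketch of the kind of feature construction the abstract refers to, assuming each feature is an axis-aligned hyperrectangle with a randomly drawn center and width, producing a binary feature vector that a linear value or policy model can use; the class and parameter names (RandomRectangularCoding, n_features, half_widths) are illustrative, not taken from the paper.

```python
import numpy as np

class RandomRectangularCoding:
    """Sketch of random rectangular coarse coding (illustrative, not the paper's code).

    Each of the n_features binary features is an axis-aligned hyperrectangle
    whose center and half-widths are drawn at random inside the given bounds.
    A feature is active (1) when the input point lies inside its rectangle.
    """

    def __init__(self, low, high, n_features=256, seed=0):
        rng = np.random.default_rng(seed)
        low, high = np.asarray(low, float), np.asarray(high, float)
        span = high - low
        # Random centers anywhere in the box, random half-widths per dimension.
        self.centers = low + rng.random((n_features, low.size)) * span
        self.half_widths = rng.uniform(0.1, 0.5, (n_features, low.size)) * span

    def __call__(self, x):
        # Binary feature vector: 1 where x falls inside a rectangle.
        inside = np.abs(np.asarray(x, float) - self.centers) <= self.half_widths
        return inside.all(axis=1).astype(float)

# Usage: a linear approximator over the binary features, as an actor-critic
# might use for Q-values or policy parameters.
coding = RandomRectangularCoding(low=[-1.0, -1.0], high=[1.0, 1.0], n_features=128)
phi = coding([0.2, -0.5])   # sparse binary feature vector for a state (or state-action point)
theta = np.zeros(phi.size)  # weights of a linear value/policy model
value = theta @ phi
```

Because the rectangles are placed at random rather than on a regular grid, the number of features need not grow exponentially with the input dimension, which is the property the abstract highlights for high-dimensional state-action spaces.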
