A CMAC-Q-Learning based Dyna agent

In this paper, a CMAC-Q-learning based Dyna agent is presented to address the problem of slow learning in reinforcement learning, with the goals of shortening the training process and increasing the learning speed. We combine CMAC, Q-learning, and prioritized sweeping techniques to construct a Dyna agent in which Q-learning is trained for policy learning; meanwhile, model approximators, called the CMAC-model and the CMAC-R-model, are in charge of approximating the environment model. The approximated model provides Q-learning with virtual interaction experience to further update the policy during the time gaps when there is no interplay between the agent and the real environment. The Dyna agent thus switches seamlessly between the real environment and the virtual environment model for policy learning. A simulation of controlling a differential-drive mobile robot has been conducted to demonstrate that the proposed method can preliminarily achieve the design goal.
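To make the Dyna loop described above concrete, the following is a minimal sketch of its three interleaved steps: real experience updates the policy directly, the same experience is used to learn an environment model, and the model then generates virtual experience for additional planning updates. All names here (`ChainEnv`, `dyna_q`) are illustrative, and the sketch substitutes a tabular model with uniform replay for the paper's CMAC approximators and prioritized sweeping.

```python
import random
from collections import defaultdict

class ChainEnv:
    """Toy 5-state chain: move left/right, reward 1 at the right end."""
    N = 5

    def reset(self):
        self.s = 0
        return self.s

    def step(self, a):  # a: 0 = left, 1 = right
        self.s = max(0, self.s - 1) if a == 0 else min(self.N - 1, self.s + 1)
        done = self.s == self.N - 1
        return self.s, (1.0 if done else 0.0), done

def dyna_q(env, episodes=50, planning_steps=10,
           alpha=0.1, gamma=0.95, eps=0.1):
    Q = defaultdict(float)  # Q[(state, action)]
    model = {}              # model[(s, a)] = (r, s'): learned environment model
    for _ in range(episodes):
        s, done = env.reset(), False
        while not done:
            # epsilon-greedy action from the current policy
            a = random.randrange(2) if random.random() < eps \
                else max((0, 1), key=lambda x: Q[(s, x)])
            s2, r, done = env.step(a)
            # (1) direct RL: Q-learning update from real experience
            Q[(s, a)] += alpha * (r + gamma * max(Q[(s2, 0)], Q[(s2, 1)]) - Q[(s, a)])
            # (2) model learning: remember the observed transition
            model[(s, a)] = (r, s2)
            # (3) planning: extra updates from virtual experience drawn from the model
            for _ in range(planning_steps):
                (ps, pa), (pr, ps2) = random.choice(list(model.items()))
                Q[(ps, pa)] += alpha * (pr + gamma * max(Q[(ps2, 0)], Q[(ps2, 1)]) - Q[(ps, pa)])
            s = s2
    return Q

if __name__ == "__main__":
    Q = dyna_q(ChainEnv())
    print({s: max((0, 1), key=lambda a: Q[(s, a)]) for s in range(ChainEnv.N)})
```

The planning loop is what lets a Dyna agent keep improving its policy between real interactions; the paper's contribution lies in replacing the tabular model with CMAC-based function approximators and in ordering the planning updates via prioritized sweeping.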