Shaping Bayesian Network Based Reinforcement Learning

The trial-and-error mechanism of reinforcement learning is essentially an exhaustive search, which is the major reason reinforcement learning is slow and time-consuming. We present an approach that models the state transitions observed during the agent's exploration with a Shaping Bayesian Network, which can then be used to shape the agent by biasing exploration toward the most promising regions of the state space, thereby reducing exploration and accelerating learning. Experimental results show that this approach significantly improves the agent's performance and shortens learning time. More importantly, it provides a way for the agent to exploit its own experience to accelerate learning.
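The idea of biasing exploration with a model learned from the agent's own transitions can be illustrated with a minimal sketch. This is not the paper's method: in place of a full Shaping Bayesian Network it uses a simple table of empirical transition counts as the model, and all names (`ShapedQLearner`, `chain_step`, the toy chain MDP) are hypothetical illustrations.

```python
import random
from collections import defaultdict

class ShapedQLearner:
    """Tabular Q-learning whose exploration is biased by a learned transition
    model. NOTE: the empirical count table below is a hypothetical stand-in
    for the paper's Shaping Bayesian Network, used only for illustration."""

    def __init__(self, n_states, n_actions, alpha=0.5, gamma=0.95, eps=0.3):
        self.nS, self.nA = n_states, n_actions
        self.alpha, self.gamma, self.eps = alpha, gamma, eps
        self.Q = defaultdict(float)                          # Q[(s, a)]
        self.counts = defaultdict(lambda: defaultdict(int))  # counts[(s, a)][s']

    def _promise(self, s, a):
        """Expected value of the next state under the empirical model;
        unseen (s, a) pairs get an optimistic default so they still get tried."""
        nxt = self.counts[(s, a)]
        total = sum(nxt.values())
        if total == 0:
            return 1.0  # optimistic prior for unexplored actions
        return sum(c / total * max(self.Q[(s2, b)] for b in range(self.nA))
                   for s2, c in nxt.items())

    def act(self, s):
        if random.random() < self.eps:
            # Biased exploration: instead of a uniform random action,
            # sample actions in proportion to their modeled promise.
            w = [self._promise(s, a) + 1e-6 for a in range(self.nA)]
            return random.choices(range(self.nA), weights=w)[0]
        return max(range(self.nA), key=lambda a: self.Q[(s, a)])

    def update(self, s, a, r, s2):
        self.counts[(s, a)][s2] += 1  # refine the transition model online
        best = max(self.Q[(s2, b)] for b in range(self.nA))
        self.Q[(s, a)] += self.alpha * (r + self.gamma * best - self.Q[(s, a)])

def chain_step(s, a, n=8):
    """Toy chain MDP: action 1 moves right, action 0 left; reward at the end."""
    s2 = min(n - 1, s + 1) if a == 1 else max(0, s - 1)
    return s2, (1.0 if s2 == n - 1 else 0.0), s2 == n - 1

random.seed(0)
agent = ShapedQLearner(n_states=8, n_actions=2)
for _ in range(300):
    s, done = 0, False
    for _ in range(50):
        a = agent.act(s)
        s2, r, done = chain_step(s, a)
        agent.update(s, a, r, s2)
        s = s2
        if done:
            break

# The greedy policy per state after training (1 = move toward the reward).
greedy = [max(range(2), key=lambda a: agent.Q[(s, a)]) for s in range(7)]
print(greedy)
```

Once the count table has seen a few rewarded trajectories, `_promise` steers exploratory actions toward transitions that historically led to high-value states, which is the mechanism the abstract describes: reusing the agent's own experience to shrink the effective search space.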