Andhill-98: A RoboCup Team which Reinforces Positioning with Observation
In reinforcement learning with limited exploration, an agent's policy tends to fall into a poor local optimum. This paper proposes the Observational Reinforcement Learning method, with which the learning agent evaluates inexperienced policies from observation and reinforces them. The method gives the agent more chances to escape from a local optimum without explicit exploration. Moreover, this paper shows the effectiveness of the method through experiments on the RoboCup positioning problem, which extend the experiments described in our RoboCup-97 paper [2].
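The abstract only outlines the idea, so the following is a minimal sketch of how an observational update might differ from ordinary reinforcement for a positioning agent. The grid of candidate positions, the reward model (`reward_at`), and the observation-based estimate (`observed_estimate`) are illustrative assumptions, not the paper's exact formulation; the point is that positions the agent never visits still receive value updates from observation, so a better position can overtake the current greedy choice without exploratory moves.

```python
import random

GRID = [(x, y) for x in range(5) for y in range(5)]   # candidate positions (assumed)
ALPHA = 0.1                                           # learning rate
values = {pos: 0.0 for pos in GRID}                   # value estimate per position

def reward_at(pos, ball):
    """Reward actually experienced at `pos` (assumed model: closeness to the ball)."""
    return -abs(pos[0] - ball[0]) - abs(pos[1] - ball[1])

def observed_estimate(pos, ball):
    """Reward the agent *estimates* for an untried position purely from observing
    the current ball location (assumed model)."""
    return -abs(pos[0] - ball[0]) - abs(pos[1] - ball[1])

def greedy_position():
    return max(GRID, key=lambda p: values[p])

for episode in range(200):
    ball = (random.randrange(5), random.randrange(5))
    pos = greedy_position()                 # greedy choice, no explicit exploration

    # Ordinary reinforcement: update only the position actually taken.
    r = reward_at(pos, ball)
    values[pos] += ALPHA * (r - values[pos])

    # Observational reinforcement (sketch): also evaluate positions that were
    # never visited, using observation alone, so the agent can escape a local
    # optimum without exploratory actions.
    for other in GRID:
        if other != pos:
            est = observed_estimate(other, ball)
            values[other] += ALPHA * (est - values[other])

print("learned best position:", greedy_position())
```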
[1] Andrew W. Moore et al. Reinforcement Learning: A Survey. J. Artif. Intell. Res., 1996.
[2] Tomohito Andou et al. Refinement of Soccer Agents' Positions Using Reinforcement Learning. RoboCup, 1997.