Switching reinforcement learning for continuous action space

Reinforcement learning (RL) is attracting attention as a technique for realizing computational intelligence, for example in adaptive and autonomous decentralized systems. In general, however, RL is not easy to put to practical use. One difficulty is designing a suitable action space for an agent, which must satisfy two competing requirements: (i) preserving the characteristics (or structure) of the original search space as much as possible, so that strategies close to the optimum can be found, and (ii) reducing the search space as much as possible, so that learning proceeds quickly. To design a suitable action space adaptively, we propose the Switching RL model, which mimics the process of an infant's motor development, in which gross motor skills develop before fine motor skills. A method for switching between controllers is then constructed by introducing and referring to an “entropy” measure. The validity of the proposed method is confirmed by computational experiments on robot navigation problems with one- and two-dimensional continuous action spaces. © 2012 Wiley Periodicals, Inc. Electron Comm Jpn, 95(3): 37–44, 2012; Published online in Wiley Online Library (wileyonlinelibrary.com). DOI 10.1002/ecj.10383
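
The abstract does not state how the entropy is defined or when a switch is triggered, so the following is only a minimal sketch of one plausible reading, not the authors' method. It assumes a coarse controller that chooses among a few discretized actions through a softmax over learned action values, and it assumes that control passes to a finer controller once the Shannon entropy of that choice falls below a fraction of its maximum. The function names, the softmax policy, the `threshold_ratio` parameter, and the bin-subdivision rule are all hypothetical.

```python
import math


def softmax(values, temperature=1.0):
    """Turn action values into a probability distribution (assumed policy form)."""
    m = max(values)  # subtract the maximum for numerical stability
    exps = [math.exp((v - m) / temperature) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def policy_entropy(probs):
    """Shannon entropy, in nats, of a discrete action distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def should_switch_to_fine(action_values, threshold_ratio=0.5):
    """Hypothetical switching rule: hand over to the fine controller once the
    coarse policy's entropy drops below a fraction of the maximum entropy."""
    probs = softmax(action_values)
    max_entropy = math.log(len(probs))  # entropy of the uniform distribution
    return policy_entropy(probs) < threshold_ratio * max_entropy


def refine_actions(low, high, chosen_index, n_coarse, n_fine):
    """Split the chosen coarse bin of [low, high] into n_fine finer actions,
    returned as the centers of the sub-bins (assumed refinement scheme)."""
    width = (high - low) / n_coarse
    bin_low = low + chosen_index * width
    fine_width = width / n_fine
    return [bin_low + (k + 0.5) * fine_width for k in range(n_fine)]
```

Under these assumptions, a coarse controller with nearly equal action values keeps learning at the gross level, while one that has settled on a single bin passes that bin to `refine_actions`, so the search space stays small early on and recovers resolution only where it is needed.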