Learning Automata Modeling of Distributed Reinforcement Learning Systems with Collective Behavior
In complex adaptive systems, interesting global behaviors emerge from many local interactions. When the emergent behavior is a computation, we refer to it as an emergent computation. The premise of emergent computation is that interesting and useful computational systems can be constructed by exploiting interactions among primitive components. In this paper, we present a collective model of learning automata that describes general reinforcement learning systems, such as multi-agent control systems. In our model, each automaton has behavioral tactics directed by the value of a utility function that explicitly depends on the automaton's own strategy and on the corresponding response from autonomous fields. The goals of the automata, as well as the nature of their interactions, are assumed to satisfy certain conditions that ensure the existence and uniqueness of a Nash strategy. To support further study and applications of this model, two formal examples are presented: a pursuit-evasion problem and an optimized resource assignment problem.
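As a rough illustration of the kind of system the abstract describes, the sketch below lets two learning automata play a repeated game in which each automaton's payoff depends on its own action and on the other's response. It is not the paper's model: the linear reward-inaction (L_R-I) update rule, the class name `LRIAutomaton`, the `PAYOFF` table, the step size, and the `play` helper are all assumptions chosen for illustration. The payoff table is built so that action 0 is strictly dominant, which gives the game a unique Nash strategy pair.

```python
import random

# Hypothetical payoff table (an assumption, not from the paper):
# PAYOFF[own_action][other_action] is the reward in [0, 1].
# Action 0 strictly dominates action 1, so (0, 0) is the unique Nash pair.
PAYOFF = [
    [0.6, 1.0],
    [0.0, 0.4],
]


class LRIAutomaton:
    """Learning automaton using the linear reward-inaction update."""

    def __init__(self, n_actions, step=0.05, rng=None):
        self.p = [1.0 / n_actions] * n_actions  # action probabilities
        self.step = step                        # learning rate
        self.rng = rng or random.Random()

    def choose(self):
        # Sample an action according to the current probability vector.
        return self.rng.choices(range(len(self.p)), weights=self.p)[0]

    def update(self, action, reward):
        # Shift probability mass toward the chosen action in proportion
        # to the reward; a reward of 0 leaves the vector unchanged.
        for j in range(len(self.p)):
            if j == action:
                self.p[j] += self.step * reward * (1.0 - self.p[j])
            else:
                self.p[j] -= self.step * reward * self.p[j]


def play(rounds=5000, seed=0):
    """Let two automata play the game repeatedly; return their strategies."""
    rng = random.Random(seed)
    a = LRIAutomaton(2, rng=rng)
    b = LRIAutomaton(2, rng=rng)
    for _ in range(rounds):
        act_a, act_b = a.choose(), b.choose()
        a.update(act_a, PAYOFF[act_a][act_b])
        b.update(act_b, PAYOFF[act_b][act_a])
    return a.p, b.p


if __name__ == "__main__":
    strategy_a, strategy_b = play()
    print("automaton A:", strategy_a)
    print("automaton B:", strategy_b)
```

Each automaton sees only its own reward, never the other's action or payoff table, so any coordination on the Nash pair arises from local updates alone. With a small step size, both probability vectors would be expected to concentrate on action 0 in this particular game.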