Paper Information - On Scalability Issues in Reinforcement Learning for Self-Reconfiguring Modular Robots

On Scalability Issues in Reinforcement Learning for Self-Reconfiguring Modular Robots

Self-reconfiguring modular robots have been receiving great attention because advances in our field are expected to deliver ultra-adaptable and robust systems. There has been remarkable progress in modular hardware and distributed controllers, e.g., [1]–[4], some of which were designed automatically by genetic algorithms, e.g., [1]. But how can the greatest adaptability be achieved? Our position is that modular robots need to run learning algorithms in order to adapt to a changing environment and deliver on the self-organization promise without (much) interference from human designers, programmers and operators. We have developed a reinforcement learning (RL) approach to learning in self-reconfiguring modular robots. There are many scalability challenges in applying RL to our field. A large number of modules means a large number of learning agents that modify their behavior at the same time, making the underlying process nonstationary. Local policies executed by individual modules need to give rise to coherent global behavior; as the number of modules increases, this property becomes hard for both human designers and learning algorithms to achieve. Finally, search spaces grow tremendously as a function of the number of modules in operation. We have been researching techniques to address these scalability issues. Specifically, we have developed two ways to dramatically reduce search spaces and thus simplify the learning problem: an incremental approach to learning, which is made possible specifically by the intrinsic modularity of our systems, and a log-linear representation, which can be used more universally. Our results suggest that the learning algorithms could become scalable and produce large, adaptive systems.

L. Kaelbling | D. Rus | Paulina Varshavskaya

[1] L. Penrose, et al. Self-Reproducing Machines, 1959.

[2] Leslie Pack Kaelbling, et al. Reinforcement Learning by Policy Search, 2002.

[3] Eiichi Yoshida, et al. Distributed adaptive locomotion by a modular robotic system, M-TRAN II, 2004, 2004 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS).

[4] D. Rus, et al. Efficient Locomotion for a Self-Reconfiguring Robot, 2005, Proceedings of the 2005 IEEE International Conference on Robotics and Automation.

[5] Marsette Vona, et al. Hierarchical control for self-assembling mobile trusses with passive and active links, 2006, Proceedings 2006 IEEE International Conference on Robotics and Automation (ICRA 2006).