Learning distributed control for modular robots

We propose automating controller design for distributed modular robots. This paper presents initial experiments in learning distributed controllers that synthesize compliant locomotion gaits for modular, self-reconfigurable robots. We use both centralized and distributed policy search and find the learning approach promising: the locomotion tasks are learned well. We also find that the additive nature of the robotic platform can speed up learning when the robot's size is increased incrementally.
