Emergence of delayed reward learning from sensorimotor coordination

When building autonomous robotic agents, designers almost always make assumptions about how the control system relates sensory information to motor actions in order to give the agent a set of basic behaviors. This raises two questions: how arbitrary are these assumptions, and to what extent do the biases they introduce limit the emergence of new behaviors from interaction with the environment? In this paper, we propose a new robot control architecture consisting solely of homogeneous, non-hierarchical sensorimotor coupling. We show that a robot using this model can display coherent, complex behaviors that emerge from the agent-environment interaction, such as tracking an object or solving a task that depends on the temporal relationship between an early clue and a delayed reward.
