A Reinforcement Learning Based Algorithm for Robot Action Planning

The learning process that arises in response to visual perception of the environment is the starting point for a great deal of research in applied and cognitive robotics. In this work, we propose a reinforcement-learning-based action planning algorithm for the assembly of spatial structures with an autonomous robot in an unstructured environment. Because of the large number of discrete states the robot can encounter, the algorithm is based on temporal-difference learning with linear basis functions for approximating the action-value (Q) function. The aim is to find the optimal sequence of actions that the agent (robot) must take to move objects in a 2D environment until they reach a predefined target state. The algorithm consists of two parts. In the first, the parameters of the Q-function approximation are learned; in the second, the learned parameters are used to generate the sequence of actions for a UR3 robot arm. We present a preliminary validation of the algorithm in an experimental laboratory scenario.
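The two-phase scheme described above can be illustrated with a minimal sketch: semi-gradient Q-learning with a linear (one-hot) basis over a small 2D grid, followed by a greedy read-out of the learned action sequence. The grid size, reward shaping, learning rates, and feature map are illustrative assumptions, not the paper's actual setup (which targets object assembly with a UR3 arm).

```python
import numpy as np

GRID = 5                                       # 5x5 grid of discrete positions
ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]   # up, down, right, left
GOAL = (4, 4)                                  # predefined target state
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1             # assumed hyperparameters

def features(state, a_idx):
    """Linear basis: one-hot feature vector over (x, y, action)."""
    phi = np.zeros(GRID * GRID * len(ACTIONS))
    x, y = state
    phi[(x * GRID + y) * len(ACTIONS) + a_idx] = 1.0
    return phi

def q_value(w, state, a_idx):
    """Q(s, a) as a linear combination of basis functions."""
    return w @ features(state, a_idx)

def step(state, a_idx):
    """Illustrative environment: move in the grid, small step cost."""
    dx, dy = ACTIONS[a_idx]
    nxt = (min(max(state[0] + dx, 0), GRID - 1),
           min(max(state[1] + dy, 0), GRID - 1))
    reward = 1.0 if nxt == GOAL else -0.01
    return nxt, reward, nxt == GOAL

def train(episodes=500, seed=0):
    """Phase 1: learn the weights of the Q-function approximation."""
    rng = np.random.default_rng(seed)
    w = np.zeros(GRID * GRID * len(ACTIONS))
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(100):                   # cap on episode length
            if rng.random() < EPS:             # epsilon-greedy exploration
                a = int(rng.integers(len(ACTIONS)))
            else:
                a = int(np.argmax([q_value(w, state, i)
                                   for i in range(len(ACTIONS))]))
            nxt, r, done = step(state, a)
            target = r if done else r + GAMMA * max(
                q_value(w, nxt, i) for i in range(len(ACTIONS)))
            # semi-gradient temporal-difference update on the linear weights
            w += ALPHA * (target - q_value(w, state, a)) * features(state, a)
            state = nxt
            if done:
                break
    return w

def greedy_path(w, start=(0, 0), max_steps=50):
    """Phase 2: use the learned weights to read off the action sequence."""
    state, path = start, [start]
    for _ in range(max_steps):
        a = int(np.argmax([q_value(w, state, i)
                           for i in range(len(ACTIONS))]))
        state, _, done = step(state, a)
        path.append(state)
        if done:
            break
    return path
```

With a one-hot basis this reduces to tabular Q-learning; the paper's point is that the same linear update scales to richer basis functions when the discrete state space is too large to enumerate.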
