Application of Q-Learning in Robot Grasping Tasks

Reinforcement learning plays a major role in the adaptive behaviour of autonomous robots. In real-world environments, however, reinforcement learning techniques such as Q-learning face serious difficulties because of rapidly growing search spaces. We explored the characteristics of discrete Q-learning in a high-dimensional, continuous setting: a simulated robot grasping task. The very simple sensors in this setup allow only a coarse identification of the actual "physical" state, leading to the problem known as "perceptual aliasing". We identified parameters, in particular the sensory sampling rate, that directly control the degree of generality of the policies to be learned. For the more general policies, which perform only rough positioning, the effects of ambiguity can be suppressed, so the system finds feasible grasping positions after only a few exploratory actions. We suggest that a neural network supporting Q-learning could improve the overall performance significantly.
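The two mechanisms discussed above, the one-step tabular Q-learning update and a coarse sensory discretization that induces perceptual aliasing, can be sketched as follows. This is a minimal illustration, not the paper's implementation: the action set, learning parameters, and bin width are assumed values chosen for the example.

```python
import random
from collections import defaultdict

ALPHA = 0.1    # learning rate (illustrative value)
GAMMA = 0.9    # discount factor (illustrative value)
EPSILON = 0.2  # exploration rate for epsilon-greedy selection
ACTIONS = ["left", "right", "up", "down"]  # assumed discrete action set

Q = defaultdict(float)  # Q[(state, action)] -> estimated value, initially 0

def discretize(sensor_value, bin_width):
    """Coarse sensory sampling: a larger bin_width maps more distinct
    physical states onto the same perceived state (perceptual aliasing),
    which in turn yields more general learned policies."""
    return int(sensor_value // bin_width)

def choose_action(state):
    """Epsilon-greedy action selection over the discrete action set."""
    if random.random() < EPSILON:
        return random.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: Q[(state, a)])

def update(state, action, reward, next_state):
    """One-step Q-learning backup: Q <- Q + alpha * (target - Q)."""
    best_next = max(Q[(next_state, a)] for a in ACTIONS)
    target = reward + GAMMA * best_next
    Q[(state, action)] += ALPHA * (target - Q[(state, action)])
```

With a coarse bin width, e.g. `discretize(3.4, 1.0)` and `discretize(3.9, 1.0)` both yield state `3`, the learner cannot distinguish these two physical configurations; the abstract's observation is that, for rough positioning, this ambiguity can be tolerated and even exploited for faster generalisation.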