Paper information - Evaluating techniques for learning a feedback controller for low-cost manipulators

Robust, tractable manipulation in unstructured environments is a prominent hurdle in robotics. Learning algorithms for controlling robotic arms have introduced elegant solutions to the complexities faced in such systems. A novel Reinforcement Learning (RL) method, Gaussian Process Dynamic Programming (GPDP), yields promising results for closed-loop control of a low-cost manipulator; however, research on most RL techniques lacks a breadth of comparable experiments on the viability of particular learning techniques in equivalent environments. We introduce several model-based learning agents as mechanisms to control a noisy, low-cost robotic system. The agents were tested in a simulated domain, learning closed-loop policies for a simple task with no prior information. The fidelity of the simulations is then confirmed by applying GPDP to a physical system.

Oliver M. Cliff | Sildomar T. Monteiro
