Paper information - A reinforcement learning approach towards autonomous suspended load manipulation using aerial robots

In this paper, we address trajectory tracking for a load suspended from a rotorcraft aerial robot. The reference trajectory is specified for the suspended load only; the aerial robot must learn its own trajectory so that the suspended load tracks that reference. As a solution, we propose a method based on least-squares policy iteration (LSPI), a reinforcement learning algorithm. The proposed method is verified through simulation and experiments.
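To make the LSPI idea concrete, the following is a minimal sketch of least-squares policy iteration on a toy problem. It is not the authors' implementation and does not model a rotorcraft or a suspended load. Everything in it is an assumption made for illustration: a five-cell line world, a target cell standing in for the reference, a reward equal to the negative squared tracking error, one-hot features, and the names `step`, `phi`, `greedy`, `lstdq`, and `lspi`.

```python
import numpy as np

# Hypothetical toy setup (not from the paper): 5 cells on a line, target cell 2.
N_STATES, ACTIONS, TARGET, GAMMA = 5, (-1, 0, 1), 2, 0.9
N_FEATURES = N_STATES * len(ACTIONS)


def step(s, a):
    """Deterministic transition; reward is the negative squared tracking error."""
    s_next = min(max(s + a, 0), N_STATES - 1)
    return s_next, -float((s_next - TARGET) ** 2)


def phi(s, a_idx):
    """One-hot feature vector for a (state, action-index) pair."""
    f = np.zeros(N_FEATURES)
    f[s * len(ACTIONS) + a_idx] = 1.0
    return f


def greedy(w, s):
    """Index of the action that maximizes the approximate Q-value w . phi(s, a)."""
    return int(np.argmax([w @ phi(s, i) for i in range(len(ACTIONS))]))


def lstdq(samples, w):
    """Policy evaluation: solve A w' = b for the policy that is greedy in w."""
    A = np.zeros((N_FEATURES, N_FEATURES))
    b = np.zeros(N_FEATURES)
    for s, a_idx, r, s_next in samples:
        f = phi(s, a_idx)
        f_next = phi(s_next, greedy(w, s_next))
        A += np.outer(f, f - GAMMA * f_next)
        b += f * r
    return np.linalg.solve(A, b)


def lspi(samples, max_iter=50, tol=1e-8):
    """Alternate evaluation and greedy improvement until the weights stop changing."""
    w = np.zeros(N_FEATURES)
    for _ in range(max_iter):
        w_new = lstdq(samples, w)
        if np.linalg.norm(w_new - w) < tol:
            return w_new
        w = w_new
    return w


# Collect one sample (s, action index, reward, next state) for every pair.
samples = []
for s in range(N_STATES):
    for i, a in enumerate(ACTIONS):
        s_next, r = step(s, a)
        samples.append((s, i, r, s_next))

w = lspi(samples)
policy = [ACTIONS[greedy(w, s)] for s in range(N_STATES)]
print(policy)
```

In this toy setting the learned greedy policy moves toward the target cell from either side and stays there once it arrives. A continuous-state problem such as the one in the abstract would need a different feature map and sampled transitions from a simulator or experiments, neither of which is shown here.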
