Constraint-aware learning of policies by demonstration

Many practical tasks in robotic systems, such as cleaning windows, writing, or grasping, are inherently constrained. Learning policies subject to constraints is a challenging problem. In this paper, we propose a constraint-aware learning method in which a redundant robot executes a policy acting in the null space of a constraint. In particular, we are interested in generalizing learned null-space policies to constraints that were not known during training. We split the combined problem of learning constraints and policies into two stages: first estimating the constraint, and then estimating a null-space policy using the remaining degrees of freedom. For a linear parametrization, we provide a closed-form solution to this problem. We also define a metric for comparing the similarity of estimated constraints, which is useful for pre-processing the trajectories recorded in the demonstrations. We validate our method by learning a wiping task from human demonstrations on flat surfaces and reproducing it on an unknown curved surface using a force/torque-based controller to achieve tool alignment. We show that, despite the differences between the training and validation scenarios, the learned policy still produces the desired wiping motion.
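
To make the two-stage idea concrete, the sketch below writes out the standard constraint-consistent decomposition and the linear least-squares estimate it enables. The notation (A, b, N, pi, phi, W) is assumed here for illustration only and may differ from the formulation actually used in the paper.

```latex
% A minimal sketch under assumed notation: x is the state, u the control,
% A(x) the estimated constraint matrix, b(x) the task-space term,
% N(x) the null-space projector, and \pi(x) the null-space policy to learn.
\[
  u(x) = A^{\dagger}(x)\,b(x) + N(x)\,\pi(x),
  \qquad
  N(x) = I - A^{\dagger}(x)\,A(x).
\]
% With a linear parametrization \pi(x) = W\,\phi(x) and demonstrated
% null-space components u_n \approx N(x_n)\,W\,\phi(x_n), the weights admit
% a closed-form (regularized) least-squares estimate:
\[
  W^{\ast} = \arg\min_{W} \sum_{n}
    \bigl\lVert u_n - N(x_n)\,W\,\phi(x_n) \bigr\rVert^{2}
    + \lambda \lVert W \rVert_{F}^{2}.
\]
```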
