Constraint-aware learning of policies by demonstration

Many practical tasks in robotic systems, such as cleaning windows, writing, or grasping, are inherently constrained. Learning policies subject to constraints is a challenging problem. In this paper, we propose a constraint-aware learning method in which a redundant robot executes a policy acting in the null space of a constraint. In particular, we are interested in generalizing learned null-space policies to constraints that were not known during training. We split the combined problem of learning constraints and policies into two stages: first estimating the constraint, and then estimating a null-space policy using the remaining degrees of freedom. For a linear parametrization, we provide a closed-form solution to this problem. We also define a metric for comparing the similarity of estimated constraints, which is useful for pre-processing the trajectories recorded in the demonstrations. We validate our method by learning a wiping task from human demonstrations on flat surfaces and reproducing it on an unknown curved surface using a force/torque-based controller to achieve tool alignment. We show that, despite the differences between the training and validation scenarios, the learned policy still produces the desired wiping motion.
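
To make the two-stage idea concrete, the sketch below writes out the standard constraint-consistent decomposition and the linear least-squares estimate it enables. The notation (A, b, N, pi, phi, W) is assumed here for illustration only and may differ from the formulation actually used in the paper.

```latex
% A minimal sketch under assumed notation: x is the state, u the control,
% A(x) the estimated constraint matrix, b(x) the task-space term,
% N(x) the null-space projector, and \pi(x) the null-space policy to learn.
\[
  u(x) = A^{\dagger}(x)\,b(x) + N(x)\,\pi(x),
  \qquad
  N(x) = I - A^{\dagger}(x)\,A(x).
\]
% With a linear parametrization \pi(x) = W\,\phi(x) and demonstrated
% null-space components u_n \approx N(x_n)\,W\,\phi(x_n), the weights admit
% a closed-form (regularized) least-squares estimate:
\[
  W^{\ast} = \arg\min_{W} \sum_{n}
    \bigl\lVert u_n - N(x_n)\,W\,\phi(x_n) \bigr\rVert^{2}
    + \lambda \lVert W \rVert_{F}^{2}.
\]
```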
