Learning to Use a Ratchet by Modeling Spatial Relations in Demonstrations

We introduce a framework in which visual features describing the interaction among a robot hand, a tool, and an assembly fixture can be learned efficiently from a small number of demonstrations. We illustrate the approach with the Robonaut-2 humanoid robot torquing a bolt using a handheld ratchet. The difficulties include uncertainty in the ratchet pose after grasping and the high precision required both to mate the socket to the bolt and to return the tool to its holder. Our approach learns the desired relative position between visual features on the ratchet and the bolt by identifying goal offsets from visual features that are consistently observable across a set of demonstrations. With this approach we show that Robonaut-2 is capable of grasping the ratchet, tightening a bolt, and returning the ratchet to the tool holder. We measure the accuracy of the socket-bolt mating subtask over multiple demonstrations and show that even a small set of demonstrations reduces the error significantly.
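
The following is a minimal sketch of the goal-offset idea described above. The data layout (per-demonstration dicts mapping a feature id to a 3D position in the bolt frame) and the variance-based selection of the most consistent feature are illustrative assumptions, not the paper's exact method.

```python
import numpy as np

def learn_goal_offset(demos):
    """Estimate a goal offset from features observed in every demonstration.

    demos: list of dicts, one per demonstration, each mapping a visual
           feature id to its 3D position relative to the bolt frame at
           the moment the socket was successfully mated to the bolt.
           (Hypothetical data layout for illustration.)
    Returns the id of the most consistent feature and its mean offset.
    """
    # Keep only features that are observable in all demonstrations.
    common_ids = set(demos[0]).intersection(*demos[1:])

    best_id, best_offset, best_var = None, None, np.inf
    for fid in common_ids:
        offsets = np.stack([d[fid] for d in demos])  # shape (n_demos, 3)
        var = offsets.var(axis=0).sum()              # total spread across demos
        if var < best_var:                           # prefer the most consistent feature
            best_id, best_offset, best_var = fid, offsets.mean(axis=0), var
    return best_id, best_offset
```

At run time, the robot would detect the selected feature on the ratchet and servo the hand until that feature sits at the learned offset relative to the bolt before attempting the mate.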
