Teaching a Robot the Semantics of Assembly Tasks

We present a three-level cognitive system in a learning-by-demonstration context. The system supports learning and transfer at both the sensorimotor level and the planning level. The fundamentally different data structures associated with these two levels are connected by an efficient mid-level representation based on so-called “semantic event chains.” We describe the representations in detail and quantify the effect of the associated learning procedures at each level under different amounts of noise. Moreover, we demonstrate the performance of the overall system with three demonstrations that were performed at a project review. The described system has a technology readiness level (TRL) of 4, which will be raised to TRL 6 in an ongoing follow-up project.
