Anticipation and attention for robust object recognition with RGBD-data in an industrial application scenario

We present an extension, based on attention and anticipation, of a robot vision pipeline for object recognition in RGB-D images from low-cost sensors such as the MS Kinect or ASUS Xtion. This work originated in research on an industrial application scenario, namely shipping-container unloading, but it applies more generally to advanced manipulation tasks in unstructured environments, where perception must be highly robust while remaining as fast as possible. For these scenarios, we build on our previous work, which proved competitive in cluttered table-top scenes and forms the backbone of our RGB-D object recognition. It is enhanced by two main contributions. First, a simple but very effective form of anticipation, namely top-down expectations of how the scene evolves as a result of the robot's own actions, is used to speed up processing. Second, attention provides a further speed-up by focusing processing on regions of interest in the scene, which are likewise selected by an anticipation mechanism. The method is analyzed in experiments using real-world data from an industrial demonstration set-up.
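The core idea of combining anticipation with attention can be illustrated with a minimal sketch: after the robot removes an object, only the image region around that object's previous pose is expected to change, so recognition is re-run on that region of interest alone. All names and the frame layout below are illustrative assumptions, not the paper's actual implementation:

```python
import numpy as np

def roi_from_anticipation(prev_boxes, removed_idx, margin=20):
    """Hypothetical anticipation step: after the robot removes the object
    with index `removed_idx`, only the region around its previous bounding
    box (x0, y0, x1, y1) is expected to change, so we attend to it alone
    (padded by `margin` pixels to absorb pose uncertainty)."""
    x0, y0, x1, y1 = prev_boxes[removed_idx]
    return (max(0, x0 - margin), max(0, y0 - margin), x1 + margin, y1 + margin)

def crop_rgbd(rgb, depth, roi):
    """Attention step: restrict further processing to the attended ROI."""
    x0, y0, x1, y1 = roi
    return rgb[y0:y1, x0:x1], depth[y0:y1, x0:x1]

# Toy 480x640 RGB-D frame, the resolution of Kinect/Xtion-class sensors.
rgb = np.zeros((480, 640, 3), dtype=np.uint8)
depth = np.zeros((480, 640), dtype=np.float32)

# Bounding boxes of objects recognized in the previous frame.
boxes = [(100, 120, 220, 260), (300, 50, 420, 200)]

roi = roi_from_anticipation(boxes, removed_idx=0)
rgb_roi, depth_roi = crop_rgbd(rgb, depth, roi)
print(rgb_roi.shape, depth_roi.shape)  # (180, 160, 3) (180, 160)
```

The speed-up comes from the reduced pixel count: here the attended region covers under 10% of the frame, so any per-pixel recognition stage runs proportionally faster.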
