Self-Teaching Strategy for Learning to Recognize Novel Objects in Collaborative Robots

A collaborative robot (cobot) is designed to be deployed flexibly across different tasks. For each new task, the cobot must be trained to detect and recognize novel objects. With the dominant object detectors based on Faster R-CNN, the user has to train the detector on a large number of manually annotated samples, which is inefficient and expensive. In this paper, we propose a self-teaching strategy that lets a cobot learn to recognize novel objects efficiently and effectively. As in human-to-human teaching, the user provides only a few examples of a novel object captured by an RGB-D camera. The cobot obtains the ground-truth annotation of the object automatically through depth segmentation. To achieve robust object detection in real-world scenes, it generates augmented training samples by virtually placing the object on various backgrounds at varying scales and orientations (2D augmentation) and by varying the viewpoint through projective transformation (3D augmentation). A state-of-the-art Faster R-CNN is retrained and evaluated in real-world scenarios for a gearbox assembly task. Comparison with conventional training approaches shows that the proposed approach is more efficient and more robust for novel object detection.
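The automatic annotation and 2D augmentation steps can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes the object can be isolated by a simple depth range, and the function names (`segment_by_depth`, `mask_to_bbox`, `paste_object`) and the `near`/`far` thresholds are hypothetical. Rotation and the projective (3D) augmentation are omitted for brevity.

```python
import numpy as np


def segment_by_depth(depth, near, far):
    """Boolean mask of pixels whose depth lies in [near, far] (assumed thresholds)."""
    return (depth >= near) & (depth <= far)


def mask_to_bbox(mask):
    """Tight inclusive box (x_min, y_min, x_max, y_max) around a non-empty mask."""
    ys, xs = np.nonzero(mask)
    return int(xs.min()), int(ys.min()), int(xs.max()), int(ys.max())


def paste_object(background, rgb, mask, top_left, scale=1.0):
    """Paste the masked object onto a background at a given scale.

    Returns the composited image and the bounding box of the pasted object,
    which serves as the automatically generated annotation.
    """
    x0, y0, x1, y1 = mask_to_bbox(mask)
    crop = rgb[y0:y1 + 1, x0:x1 + 1]
    crop_mask = mask[y0:y1 + 1, x0:x1 + 1]

    # Nearest-neighbour rescaling via index lookup (keeps the sketch NumPy-only).
    h, w = crop_mask.shape
    new_h = max(1, int(round(h * scale)))
    new_w = max(1, int(round(w * scale)))
    rows = np.minimum((np.arange(new_h) / scale).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) / scale).astype(int), w - 1)
    crop_s = crop[rows][:, cols]
    mask_s = crop_mask[rows][:, cols]

    ty, tx = top_left
    if ty + new_h > background.shape[0] or tx + new_w > background.shape[1]:
        raise ValueError("scaled object does not fit at the requested position")

    out = background.copy()
    region = out[ty:ty + new_h, tx:tx + new_w]  # view into `out`
    region[mask_s] = crop_s[mask_s]             # copy object pixels only

    ys, xs = np.nonzero(mask_s)
    bbox = (tx + int(xs.min()), ty + int(ys.min()),
            tx + int(xs.max()), ty + int(ys.max()))
    return out, bbox


# Toy example: a 3x3 bright object that is closer to the camera than the scene.
depth = np.full((6, 8), 2.0)
depth[2:5, 3:6] = 0.5
rgb = np.zeros((6, 8, 3), dtype=np.uint8)
rgb[2:5, 3:6] = 255

mask = segment_by_depth(depth, near=0.3, far=1.0)
background = np.zeros((20, 20, 3), dtype=np.uint8)
augmented, bbox = paste_object(background, rgb, mask, top_left=(5, 7), scale=2.0)
```

Each call to `paste_object` with a different background, position, or scale yields a new training image together with its bounding box, so no manual annotation is needed for the augmented samples.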
