Visual Perception Framework for an Intelligent Mobile Robot

Visual perception is a fundamental capability that intelligent mobile robots need in order to interact properly and safely with humans in the real world. Recent advances in deep learning have led to major breakthroughs in vision technology. However, research on integrating diverse visual perception methods into robotic systems is still in its infancy and lacks validation in real-world scenarios. In this paper, we present a visual perception framework for an intelligent mobile robot. Built on the Robot Operating System (ROS) middleware, our framework integrates a broad set of advanced algorithms capable of recognising people, objects and human poses, as well as describing observed scenes. We evaluate the performance and acceptability of the proposed framework using two mobile service robots in several challenge scenarios of international robotics competitions.
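As a rough illustration of what integrating several perception algorithms behind one interface can look like, the following is a minimal sketch in plain Python. It is not the paper's implementation: the class `PerceptionFramework`, the module names, and the stub functions are all hypothetical, and the stubs return fixed placeholder values instead of running real models. In a ROS-based system each module would more plausibly run as its own node and exchange messages over topics; that detail is omitted here so the example stays self-contained.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

# Hypothetical types: a frame is any image-like object, and a module
# is any callable that maps a frame to its recognition result.
Frame = Any
Module = Callable[[Frame], Any]


@dataclass
class PerceptionFramework:
    """Illustrative registry that runs every perception module on a frame."""
    modules: Dict[str, Module] = field(default_factory=dict)

    def register(self, name: str, module: Module) -> None:
        # Store the module under a name used as the key in the output.
        self.modules[name] = module

    def perceive(self, frame: Frame) -> Dict[str, Any]:
        # Run each registered module on the same frame and collect results.
        return {name: module(frame) for name, module in self.modules.items()}


# Stub modules standing in for real models (assumptions, not the paper's code).
def detect_objects(frame: Frame) -> list:
    return [{"label": "cup", "box": (10, 20, 50, 60)}]


def identify_people(frame: Frame) -> list:
    return [{"person_id": 0}]


def estimate_poses(frame: Frame) -> list:
    return [{"person_id": 0, "keypoints": []}]


def describe_scene(frame: Frame) -> str:
    return "a cup on a table"


framework = PerceptionFramework()
framework.register("objects", detect_objects)
framework.register("people", identify_people)
framework.register("poses", estimate_poses)
framework.register("caption", describe_scene)

# A real system would pass a camera image; None is enough for the stubs.
result = framework.perceive(None)
```

Keeping each capability behind the same callable signature means a detector or captioning model can be swapped without touching the code that consumes `result`.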
