Paper information - Integration of CNN into a Robotic Architecture to Build Semantic Maps of Indoor Environments

Integration of CNN into a Robotic Architecture to Build Semantic Maps of Indoor Environments

In robotics, semantic mapping refers to the construction of a rich representation of the environment that includes the high-level information the robot needs to accomplish its tasks. Building a semantic map requires algorithms that process sensor data at different levels (geometric, topological, and object detections/categories), whose outputs must be integrated into a unified model. This paper describes a robotic architecture that successfully builds such semantic maps for indoor environments. For this purpose, within a ROS-based ecosystem, we apply a state-of-the-art Convolutional Neural Network (CNN), specifically YOLOv3, to detect objects in images. The detection results are placed within a geometric map of the environment using several modules of the architecture: robot localization, camera extrinsic calibration, data from a depth camera, etc. We demonstrate the suitability of the proposed framework by building semantic maps of several home environments from the Robot@Home dataset, using Unity 3D as a tool to visualize the maps and to support future robotic developments.
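The abstract does not say how a detection is actually placed in the geometric map, so the following is only a minimal sketch of one common way to do it, not the authors' implementation. It assumes a pinhole camera model, a depth image in meters aligned with the color image, a planar robot pose (x, y, yaw) from localization, and a fixed camera-to-robot extrinsic transform. All function names, parameter names, and numeric values are hypothetical.

```python
import numpy as np

def pixel_to_camera(u, v, depth, fx, fy, cx, cy):
    """Back-project pixel (u, v) with metric depth into the camera optical
    frame (x right, y down, z forward), as a homogeneous point."""
    x = (u - cx) * depth / fx
    y = (v - cy) * depth / fy
    return np.array([x, y, depth, 1.0])

def pose2d_to_matrix(x, y, yaw):
    """4x4 transform taking robot-frame points to the map frame,
    for a planar robot pose (x, y, yaw)."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, 0.0, x],
                     [s,  c, 0.0, y],
                     [0.0, 0.0, 1.0, 0.0],
                     [0.0, 0.0, 0.0, 1.0]])

def detection_to_map(bbox, depth_image, intrinsics, T_robot_camera, robot_pose):
    """Place the center of a detection bounding box in the map frame.

    bbox: (u_min, v_min, u_max, v_max) in pixels (assumed detector output).
    intrinsics: (fx, fy, cx, cy) of the pinhole model.
    T_robot_camera: 4x4 extrinsic calibration (camera frame -> robot frame).
    robot_pose: (x, y, yaw) of the robot in the map frame.
    """
    u_min, v_min, u_max, v_max = bbox
    u = (u_min + u_max) // 2
    v = (v_min + v_max) // 2
    depth = float(depth_image[v, u])  # depth at the box center, in meters
    p_camera = pixel_to_camera(u, v, depth, *intrinsics)
    T_map_robot = pose2d_to_matrix(*robot_pose)
    return (T_map_robot @ T_robot_camera @ p_camera)[:3]

# Hypothetical example values, chosen only to make the arithmetic easy to follow.
depth_image = np.full((480, 640), 2.0)       # every pixel is 2 m away
intrinsics = (525.0, 525.0, 320.0, 240.0)    # fx, fy, cx, cy
# Camera optical frame -> robot frame (x forward, y left, z up), camera 1 m high.
T_robot_camera = np.array([[0.0, 0.0, 1.0, 0.0],
                           [-1.0, 0.0, 0.0, 0.0],
                           [0.0, -1.0, 0.0, 1.0],
                           [0.0, 0.0, 0.0, 1.0]])
robot_pose = (1.0, 0.0, np.pi / 2)           # robot at (1, 0), facing +y
bbox = (300, 220, 340, 260)                  # box centered on the principal point

point_map = detection_to_map(bbox, depth_image, intrinsics,
                             T_robot_camera, robot_pose)
print(point_map)
```

In this example the object lies 2 m straight ahead of a camera mounted 1 m above the robot base, so the computed map position is approximately (1, 2, 1). A real pipeline would more likely aggregate depth over the whole bounding box and filter invalid readings instead of sampling a single pixel.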
