CNN-based Monocular Decentralized SLAM on embedded FPGA

Decentralized visual simultaneous localization and mapping (DSLAM) lets robots share location and environmental information with one another, which is essential for many multi-robot applications. Visual odometry (VO) is the basic component that estimates the 6-DoF absolute pose of each robot. Decentralized place recognition (DPR) is the fundamental component that produces candidate place matches for sharing information among different robots. The goal of this paper is to build a CNN-based, real-time DSLAM system on embedded FPGA platforms. Because VO requires high precision, existing quantization methods cannot be applied directly. We improve the fixed-point fine-tuning method for CNN-based monocular VO so that VO can be deployed on a fixed-point FPGA accelerator. We also explore how the DPR frequency influences the DSLAM results and identify a DPR frequency that balances accuracy and speed. Finally, we propose a cross-component pipeline scheduling method that raises the DPR frequency and thereby further improves the final accuracy of DSLAM under the same hardware resource constraints.
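To make the fixed-point constraint concrete, here is a minimal sketch of generic symmetric fixed-point quantization with a power-of-two scale, of the kind a fixed-point accelerator typically expects. It is not the improved fine-tuning method of this paper; the function names, the 8-bit width, and the rule for choosing the fractional bit count are all assumptions made for illustration.

```python
import math

def choose_frac_bits(values, total_bits=8):
    """Pick a fractional bit count so the largest magnitude roughly fits
    (hypothetical helper; any overflow is clipped later in quantize)."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return total_bits - 1
    # Integer bits needed for the magnitude; the sign bit is counted separately.
    int_bits = math.floor(math.log2(max_abs)) + 1
    return total_bits - 1 - int_bits

def quantize(values, total_bits=8, frac_bits=None):
    """Map floats to signed integer codes: code = round(v * 2**frac_bits), clipped."""
    if frac_bits is None:
        frac_bits = choose_frac_bits(values, total_bits)
    scale = 2.0 ** frac_bits
    q_min = -(2 ** (total_bits - 1))
    q_max = 2 ** (total_bits - 1) - 1
    codes = [min(max(round(v * scale), q_min), q_max) for v in values]
    return codes, frac_bits

def dequantize(codes, frac_bits):
    """Recover the float value that each integer code represents."""
    return [c / 2.0 ** frac_bits for c in codes]

# Assumed example weights, not taken from the paper.
weights = [0.5, -0.25, 0.1, 0.9]
codes, frac_bits = quantize(weights)
restored = dequantize(codes, frac_bits)
```

In a fine-tuning loop, the forward pass would typically use values like `restored` in place of the floating-point weights so that the network adapts to the rounding error. For a task as precision-sensitive as pose regression, that error is the reason existing quantization schemes may not carry over unchanged.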
