ac2SLAM: FPGA Accelerated High-Accuracy SLAM with Heapsort and Parallel Keypoint Extractor

Robust and accurate Simultaneous Localization and Mapping (SLAM) is critical for robotics, because it underpins the rich functions of the application layer. However, limited computing power and storage capacity make it challenging to deploy high-accuracy SLAM efficiently on embedded devices. In this work, we propose a complete acceleration scheme, termed ac2SLAM, based on the ORB-SLAM2 algorithm and covering both the front end and the back end, and implement it on an FPGA platform. Specifically, ac2SLAM features: 1) a scalable and parallel ORB extractor that extracts sufficient keypoints and scores for throughput matching with 4% error; 2) a PingPong heapsort component (pp-heapsort) that selects the significant keypoints and achieves a single-cycle initiation interval, reducing the amount of data transferred between the accelerator and the host CPU; and 3) potential parallel acceleration strategies for the back-end optimization. Compared with running ORB-SLAM2 on the ARM processor, ac2SLAM is 2.1× and 2.7× faster on the TUM and KITTI datasets, respectively, while maintaining 10% error relative to the state-of-the-art (SOTA) eSLAM. In addition, the FPGA-accelerated front end is 4.55× and 40× faster than eSLAM and the ARM processor, respectively. ac2SLAM is fully open-sourced at https://github.com/SLAM-Hardware/acSLAM.
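
The idea of using a heap to keep only the most significant keypoints can be illustrated in software. The following is a minimal sketch, not the pp-heapsort hardware design: it assumes each keypoint is a hypothetical `(score, x, y)` tuple and uses a bounded min-heap from Python's standard `heapq` module to retain the `k` highest-scoring keypoints from a stream. The function name `select_top_keypoints` and the sample data are illustrative assumptions.

```python
import heapq
from typing import Iterable, List, Tuple

# Hypothetical keypoint layout for illustration: (score, x, y).
Keypoint = Tuple[float, int, int]


def select_top_keypoints(stream: Iterable[Keypoint], k: int) -> List[Keypoint]:
    """Keep the k highest-scoring keypoints from a stream, best first."""
    if k <= 0:
        return []
    heap: List[Keypoint] = []  # min-heap: heap[0] is the weakest kept keypoint
    for kp in stream:
        if len(heap) < k:
            heapq.heappush(heap, kp)
        elif kp[0] > heap[0][0]:
            # The new keypoint beats the weakest kept one, so swap it in.
            heapq.heapreplace(heap, kp)
    return sorted(heap, reverse=True)


candidates = [(12.0, 5, 7), (48.5, 10, 3), (7.25, 1, 1), (33.0, 8, 8), (48.5, 2, 9)]
top3 = select_top_keypoints(candidates, 3)
```

Because only `k` keypoints are kept at any time, the amount of data handed to the next stage is bounded by `k` regardless of how many candidates the extractor produces, which is the same motivation the abstract gives for filtering before transfer to the host CPU.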