Faster R-CNN Learning-Based Semantic Filter for Geometry Estimation and Its Application in vSLAM Systems
Epipolar geometry is a fundamental constraint used in computer vision systems to estimate parameters from point correspondences. It is most commonly described by a 3×3 matrix called the fundamental matrix, which encodes the geometric relationship between a pair of stereo images. Efficient estimation of this matrix substantially improves initialization in visual simultaneous localization and mapping (vSLAM), which uses correspondence-based epipolar geometry to determine the camera trajectory and the three-dimensional structure of the scene. Conventional robust methods for epipolar geometry estimation can become computationally inefficient and inaccurate when many correspondences are of low quality. Because semantic information can be more stable than pixel intensities or descriptors, this paper proposes a novel Faster Region-based Convolutional Network (Faster R-CNN) learning-based approach, called the semantic filter, to address these problems. The semantic filter is first trained on different semantic patches, which are characterized by their outlier distributions, so that it assigns different semantic labels to image contexts. Then, the patches with low-level semantic labels are filtered out. Finally, precise and robust correspondences are obtained by matching only within the high-level semantic contexts, which makes the correspondence-based calculation more accurate. For dynamic outdoor scenes, extensive experiments show that the semantic filter helps vSLAM localize accurately and robustly on a map from different viewpoints. In a completely static scenario, the semantic filter removes low-quality correspondences, enabling the mobile robot to operate well.
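The filter-then-estimate idea can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The trained Faster R-CNN is not reproduced: `rejected_boxes` is a hypothetical stand-in for the image regions such a detector would label as low-quality, and the fundamental matrix F (satisfying x2^T F x1 = 0 for matched homogeneous points x1, x2) is estimated with the plain normalized eight-point algorithm rather than whatever robust estimator the paper uses. All function names are illustrative.

```python
import numpy as np

def normalize(pts):
    """Hartley normalization: zero mean, average distance sqrt(2)."""
    mean = pts.mean(axis=0)
    scale = np.sqrt(2) / np.linalg.norm(pts - mean, axis=1).mean()
    T = np.array([[scale, 0, -scale * mean[0]],
                  [0, scale, -scale * mean[1]],
                  [0, 0, 1]])
    homog = np.column_stack([pts, np.ones(len(pts))])
    return (T @ homog.T).T, T

def eight_point(pts1, pts2):
    """Estimate F with x2^T F x1 = 0 from at least 8 matches (N x 2 arrays)."""
    x1, T1 = normalize(pts1)
    x2, T2 = normalize(pts2)
    # One row per match of the linear system A f = 0, f = F flattened row-wise.
    A = np.column_stack([x2[:, [0]] * x1, x2[:, [1]] * x1, x1])
    F = np.linalg.svd(A)[2][-1].reshape(3, 3)
    U, S, Vt = np.linalg.svd(F)
    F = U @ np.diag([S[0], S[1], 0.0]) @ Vt   # enforce rank 2
    F = T2.T @ F @ T1                         # undo the normalization
    return F / np.linalg.norm(F)

def semantic_filter(pts1, pts2, rejected_boxes):
    """Drop matches whose image-1 point lies in a rejected box (x0, y0, x1, y1).

    Assumption: the boxes come from some detector; here they are given by hand.
    """
    keep = np.ones(len(pts1), dtype=bool)
    for x0, y0, x1, y1 in rejected_boxes:
        inside = ((pts1[:, 0] >= x0) & (pts1[:, 0] <= x1) &
                  (pts1[:, 1] >= y0) & (pts1[:, 1] <= y1))
        keep &= ~inside
    return pts1[keep], pts2[keep]

def epipolar_distance(F, pts1, pts2):
    """Distance in pixels from each image-2 point to its epipolar line F x1."""
    h1 = np.column_stack([pts1, np.ones(len(pts1))])
    h2 = np.column_stack([pts2, np.ones(len(pts2))])
    lines = h1 @ F.T
    return np.abs(np.sum(h2 * lines, axis=1)) / np.linalg.norm(lines[:, :2], axis=1)

def demo(seed=0):
    """Synthetic two-view scene with corrupted matches inside one box."""
    rng = np.random.default_rng(seed)
    K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
    X = np.column_stack([rng.uniform(-2, 2, 200),
                         rng.uniform(-2, 2, 200),
                         rng.uniform(4, 8, 200)])
    c, s = np.cos(0.1), np.sin(0.1)
    R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])
    t = np.array([-0.5, 0.05, 0.1])

    def project(Xc):
        x = (K @ Xc.T).T
        return x[:, :2] / x[:, 2:]

    pts1 = project(X)
    pts2_true = project(X @ R.T + t)

    # Simulate a moving object: matches inside the box get wrong positions.
    box = (250, 170, 390, 310)
    in_box = ((pts1[:, 0] >= box[0]) & (pts1[:, 0] <= box[2]) &
              (pts1[:, 1] >= box[1]) & (pts1[:, 1] <= box[3]))
    pts2 = pts2_true.copy()
    pts2[in_box] += rng.uniform(-50, 50, (int(in_box.sum()), 2))

    F_raw = eight_point(pts1, pts2)
    F_filtered = eight_point(*semantic_filter(pts1, pts2, [box]))

    # Score both estimates on the matches that were never corrupted.
    clean1, clean2 = pts1[~in_box], pts2_true[~in_box]
    err_raw = np.median(epipolar_distance(F_raw, clean1, clean2))
    err_filtered = np.median(epipolar_distance(F_filtered, clean1, clean2))
    return err_raw, err_filtered
```

Calling `demo()` returns the median epipolar distance on the uncorrupted matches, first for the estimate that uses all matches and then for the estimate computed after filtering. In this noise-free setup the filtered estimate should fit the clean matches almost exactly, whereas the unfiltered least-squares fit is pulled away by the corrupted region. Real data would additionally call for a robust estimator such as RANSAC on top of the filtering step.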