Towards Utilizing Deep Uncertainty In Traditional SLAM

Recent advances in Simultaneous Localization and Mapping (SLAM) enable robots to behave intelligently across a wide range of situations. Performing well in dynamic environments, however, requires extracting robust features from the static parts of consecutive frames. Deep learning now provides meaningful and robust feature representations: by combining several deep architectures and retrieving uncertainty information about the dynamic objects in a scene and the motion flow of the environment, a traditional SLAM system can extract robust features even in dynamic environments. To our knowledge, this is the first paper to fuse multiple sources of deep uncertainty into traditional SLAM to obtain robust features in dynamic environments and improve the accuracy of motion estimation for autonomous vehicles.
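As a concrete illustration of the idea (a minimal sketch, not the paper's exact pipeline), the snippet below assumes a per-pixel dynamic-object probability map, such as the softmax output of a segmentation network, and uses it to discard ORB keypoints that land on likely-moving regions before they are passed to a traditional SLAM front end. The names `dynamic_prob` and `filter_dynamic_features` are hypothetical and chosen here for illustration.

```python
import cv2
import numpy as np

def filter_dynamic_features(gray_frame, dynamic_prob, threshold=0.5):
    """Keep only ORB features that fall on likely-static pixels.

    gray_frame:   HxW uint8 grayscale image.
    dynamic_prob: HxW float array in [0, 1]; the per-pixel probability
                  that a pixel belongs to a dynamic object (assumed to
                  come from a deep segmentation network).
    """
    orb = cv2.ORB_create(nfeatures=2000)
    keypoints = orb.detect(gray_frame, None)

    # Discard keypoints whose probability of lying on a dynamic
    # object exceeds the threshold; the survivors come from the
    # static context of the scene.
    static_kps = [
        kp for kp in keypoints
        if dynamic_prob[int(kp.pt[1]), int(kp.pt[0])] < threshold
    ]

    # Compute descriptors only for the retained static keypoints.
    static_kps, descriptors = orb.compute(gray_frame, static_kps)
    return static_kps, descriptors
```

The same gating idea extends to other uncertainty sources, e.g. weighting feature matches by optical-flow confidence, before the pose is estimated from the filtered correspondences.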
