Large-scale direct SLAM with stereo cameras

We propose a novel Large-Scale Direct SLAM algorithm for stereo cameras (Stereo LSD-SLAM) that runs in real time at high frame rates on standard CPUs. In contrast to sparse interest-point-based methods, our approach aligns images directly based on the photoconsistency of all high-contrast pixels, including corners, edges and highly textured areas. It concurrently estimates the depth at these pixels from two types of stereo cues: static stereo through the fixed-baseline stereo camera setup, and temporal multi-view stereo exploiting the camera motion. By incorporating both disparity sources, our algorithm can even estimate the depth of pixels that are under-constrained when only using fixed-baseline stereo. Using a fixed baseline, on the other hand, avoids the scale drift that typically occurs in pure monocular SLAM. We furthermore propose a robust approach to enforce illumination invariance, capable of handling aggressive brightness changes between frames, which greatly improves performance in realistic settings. In experiments, we demonstrate state-of-the-art results on stereo SLAM benchmarks such as KITTI and on challenging datasets from the EuRoC Challenge 3 for micro aerial vehicles.
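
To illustrate why a fixed baseline gives metric scale and how two depth cues might be combined, the following is a minimal sketch and not the paper's implementation. It assumes a rectified stereo pair with a known focal length (in pixels) and baseline (in metres), and it assumes each cue yields a Gaussian inverse-depth estimate that can be merged by inverse-variance weighting; the function names and the fusion rule are illustrative assumptions, not taken from the abstract.

```python
def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Metric depth for a rectified stereo pair: z = f * b / d.

    Because the baseline b is known in metres, the result is in metres,
    which is why static stereo does not suffer from scale ambiguity.
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def fuse_inverse_depths(estimates):
    """Fuse (inverse_depth, variance) pairs by inverse-variance weighting.

    Assumption for illustration: each cue (static or temporal stereo)
    provides an independent Gaussian estimate of the same inverse depth.
    Returns the fused inverse depth and its variance.
    """
    weights = [1.0 / var for _, var in estimates]
    total = sum(weights)
    fused = sum(w * inv for w, (inv, _) in zip(weights, estimates)) / total
    return fused, 1.0 / total


# Hypothetical values: 10 px disparity, 500 px focal length, 0.5 m baseline.
z_static = depth_from_disparity(10.0, 500.0, 0.5)

# Combine a static-stereo estimate with an equally uncertain temporal one.
fused_inv, fused_var = fuse_inverse_depths([(1.0 / z_static, 1e-4), (0.05, 1e-4)])
```

Under this assumed model, a pixel whose static-stereo estimate is poorly constrained (large variance) is dominated by the temporal estimate, and vice versa.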
