DT-SLAM: Deferred Triangulation for Robust SLAM

Obtaining a good baseline between video frames is a key requirement for vision-based monocular SLAM systems. However, if the video frames contain only a few 2D feature correspondences with a good baseline, or if the camera initially only rotates without sufficient translation, tracking and mapping become unstable. We introduce a real-time visual SLAM system that incrementally tracks individual 2D features and estimates camera pose from matched 2D features, regardless of the length of the baseline. Triangulation of 2D features into 3D points is deferred until key frames with sufficient baseline for those features are available. Our method can also handle purely rotational motion, and it fuses the two types of measurements in a bundle adjustment step. We also introduce adaptive criteria for key frame selection, which support efficient optimization and the handling of multiple maps. We demonstrate that our SLAM system improves camera pose estimates and robustness, even with purely rotational motions.
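The idea of deferring triangulation can be illustrated with a minimal sketch. The abstract does not state the actual baseline criterion, so the code below assumes a simple parallax-angle test between two world-frame bearing rays and a midpoint triangulation; the threshold value and all function names are hypothetical and are not taken from the paper.

```python
import math

# Hypothetical threshold; the paper's actual criterion is not given in the abstract.
PARALLAX_THRESHOLD_DEG = 2.0


def _sub(a, b):
    return [x - y for x, y in zip(a, b)]


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _unit(a):
    n = math.sqrt(_dot(a, a))
    return [x / n for x in a]


def parallax_deg(ray_a, ray_b):
    """Angle in degrees between two bearing rays given in the world frame."""
    c = _dot(_unit(ray_a), _unit(ray_b))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, c))))


def midpoint_triangulate(center_a, ray_a, center_b, ray_b):
    """Midpoint of the shortest segment joining two non-parallel rays."""
    da, db = _unit(ray_a), _unit(ray_b)
    w = _sub(center_a, center_b)
    b = _dot(da, db)
    d = _dot(da, w)
    e = _dot(db, w)
    denom = 1.0 - b * b  # nonzero because the rays are not parallel
    s = (b * e - d) / denom
    t = (e - b * d) / denom
    pa = [c + s * x for c, x in zip(center_a, da)]
    pb = [c + t * x for c, x in zip(center_b, db)]
    return [(x + y) / 2.0 for x, y in zip(pa, pb)]


def try_triangulate(center_a, ray_a, center_b, ray_b,
                    min_parallax_deg=PARALLAX_THRESHOLD_DEG):
    """Return a 3D point if parallax is sufficient, else None (stay a 2D feature)."""
    if parallax_deg(ray_a, ray_b) < min_parallax_deg:
        return None  # defer: baseline too short, e.g. pure rotation
    return midpoint_triangulate(center_a, ray_a, center_b, ray_b)
```

In this sketch, a feature seen from two key frames one unit apart is triangulated, while the same feature seen from a camera that only rotated (identical centers, identical world-frame rays) returns `None` and would remain a 2D feature until a later key frame provides enough baseline.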
