Deep Monocular Visual Odometry for Ground Vehicle

Monocular visual odometry, which enables robots to localize themselves in unexplored environments, is a crucial research problem in robotics. Although existing learning-based end-to-end methods reduce engineering effort such as accurate camera calibration and tedious case-by-case parameter tuning, their accuracy remains limited. One main reason is that previous works aim to learn six-degrees-of-freedom motion, even though a ground vehicle's motion is constrained by its mechanical structure and dynamics. To push this limit, we analyze the motion pattern of a ground vehicle and focus on learning two-degrees-of-freedom motion through the proposed motion focusing and decoupling. Experiments on the KITTI dataset show that the proposed motion focusing and decoupling approach improves visual odometry performance by reducing the relative pose error. Moreover, with the reduced dimension of the learning objective, our network is much lighter, with only four convolution layers; it converges quickly during training and runs in real time at over 200 frames per second during testing.
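To illustrate why two degrees of freedom can suffice for a ground vehicle, the sketch below integrates per-frame 2-DoF motion estimates into a planar trajectory. It assumes the two learned quantities are forward translation and yaw change per frame (a common planar-motion parameterization; the paper's exact decoupling may differ) and uses only the standard library:

```python
import math

def integrate_planar_odometry(motions, x=0.0, y=0.0, theta=0.0):
    """Compose per-frame 2-DoF motions (forward distance, yaw change)
    into a 2D trajectory on the ground plane.

    Each step advances along the current heading, then applies the yaw
    increment, so the full pose (x, y, theta) is recovered from only
    two predicted values per frame instead of six.
    """
    poses = [(x, y, theta)]
    for forward, dyaw in motions:
        x += forward * math.cos(theta)
        y += forward * math.sin(theta)
        theta += dyaw
        poses.append((x, y, theta))
    return poses

# Example: two frames, 1 m forward each, turning 45 degrees per frame.
traj = integrate_planar_odometry([(1.0, math.pi / 4), (1.0, math.pi / 4)])
```

After the two steps the vehicle has traced a quarter turn: the final pose is approximately (1.707, 0.707) with heading pi/2, recovered entirely from the low-dimensional motion estimates.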
