Paper information - Estimating Metric Scale Visual Odometry from Videos using 3D Convolutional Networks

We present an end-to-end deep learning approach for performing metric scale-sensitive regression tasks, such as visual odometry, with a single camera and no additional sensors. We propose a novel 3D convolutional architecture, 3DC-VO, that can leverage temporal relationships over a short moving window of images to estimate linear and angular velocities. The network makes local predictions on stacks of images, and these predictions can be integrated to form a full trajectory. We apply 3DC-VO to the KITTI visual odometry benchmark and to the task of estimating a pilot’s control inputs from a first-person video of a quadrotor flight. Our method exhibits increased accuracy relative to comparable learning-based algorithms trained on monocular images. We also show promising results for quadrotor control input prediction when trained on a new dataset collected with a UAV simulator.
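To illustrate how local velocity predictions can be integrated into a trajectory, the following is a minimal sketch rather than the authors' implementation. It assumes planar motion, a fixed time step, and a forward Euler update; the function name `integrate_planar`, its parameters, and the sample values are hypothetical and chosen only for illustration.

```python
import math


def integrate_planar(velocities, dt, start=(0.0, 0.0, 0.0)):
    """Dead-reckon a 2D trajectory from (forward speed, yaw rate) pairs.

    Assumptions (not from the paper): motion is planar, each prediction
    holds for one fixed time step `dt`, and forward Euler is accurate enough.
    """
    x, y, heading = start
    poses = [(x, y, heading)]
    for v, omega in velocities:
        # Move along the current heading, then apply the predicted rotation.
        x += v * math.cos(heading) * dt
        y += v * math.sin(heading) * dt
        heading += omega * dt
        poses.append((x, y, heading))
    return poses


# Hypothetical predictions: 1 m/s straight ahead for ten steps of 0.1 s.
straight = integrate_planar([(1.0, 0.0)] * 10, dt=0.1)

# Hypothetical predictions: 1 m/s while turning at pi/2 rad/s for one second.
turn = integrate_planar([(1.0, math.pi / 2)] * 1000, dt=0.001)
```

Because every pose is built on the previous one, any error in a single local prediction carries forward into all later poses, which is why accurate metric-scale velocity estimates matter for the integrated trajectory.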

Gaurav S. Sukhatme | James A. Preiss | Alexander S. Koumis
