Evaluation of non-geometric methods for visual odometry

Visual Odometry (VO) is one of the fundamental building blocks of modern autonomous robot navigation and mapping. While most state-of-the-art techniques use geometric methods to estimate camera ego-motion from optical flow vectors, learning approaches to this problem have been proposed in the last few years. These approaches are still emerging and there is much left to explore. This work follows this track by applying kernel machines to monocular visual ego-motion estimation. Unlike geometric methods, learning-based approaches to monocular visual odometry can overcome issues such as scale estimation and camera calibration, provided that training data are available. While some previous works have proposed learning paradigms for VO, to our knowledge no extensive evaluation of kernel-based methods applied to Visual Odometry has been conducted. To fill this gap, we consider publicly available datasets and perform several experiments in order to set a comparison baseline with traditional techniques. Experimental results show good performance of the learning algorithms and establish them as a solid alternative to geometric techniques, which are computationally intensive and complex to implement.

We stress the advantages of non-geometric (learned) VO as an alternative or an addition to standard geometric methods. Ego-motion is computed with state-of-the-art regression techniques, namely Support Vector Machines (SVM) and Gaussian Processes (GP). To our knowledge, this is the first time SVMs have been applied to the VO problem. We conduct an extensive evaluation on three publicly available datasets, spanning both indoor and outdoor environments. The experiments show that non-geometric VO is a good alternative, or addition, to standard VO systems.
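As a rough illustration of what regressing ego-motion with a kernel method can look like, the sketch below computes a Gaussian Process posterior mean with a squared-exponential kernel in plain Python. It is not the authors' pipeline: the two-dimensional "flow feature", the toy target, the kernel choice, and the hyperparameters `length_scale` and `noise` are all assumptions made for the example, and only the GP mean is shown (no predictive variance, no SVM).

```python
import math


def rbf_kernel(a, b, length_scale):
    """Squared-exponential kernel between two feature vectors."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq_dist / (2.0 * length_scale ** 2))


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_fit(X, y, length_scale=0.5, noise=1e-6):
    """Return alpha = (K + noise * I)^-1 y for the training set."""
    K = [
        [rbf_kernel(xi, xj, length_scale) + (noise if i == j else 0.0)
         for j, xj in enumerate(X)]
        for i, xi in enumerate(X)
    ]
    return solve_linear(K, y)


def gp_predict(X_train, alpha, x_new, length_scale=0.5):
    """GP posterior mean at x_new: sum_i alpha_i * k(x_i, x_new)."""
    return sum(a * rbf_kernel(xt, x_new, length_scale)
               for a, xt in zip(alpha, X_train))


def make_toy_data():
    """Hypothetical data: (mean horizontal flow, mean vertical flow) -> forward motion."""
    grid = [0.0, 0.5, 1.0]
    X = [(u, v) for u in grid for v in grid]
    y = [0.8 * u + 0.2 * v for u, v in X]  # invented linear relation
    return X, y


if __name__ == "__main__":
    X, y = make_toy_data()
    alpha = gp_fit(X, y)
    print(gp_predict(X, alpha, (0.25, 0.25)))
```

In practice one would more likely rely on an existing library (for example scikit-learn's `GaussianProcessRegressor` or `SVR`) and fit one regressor per motion component; the hand-written solver here only keeps the example free of dependencies.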
