Semi-parametric learning for visual odometry

This paper addresses the visual odometry problem from a machine learning perspective. Optical flow information from a single camera is used as input to a multiple-output Gaussian process (MOGP) framework that estimates linear and angular camera velocities. This approach has several benefits. (1) It removes the need for conventional camera calibration by introducing a semi-parametric model that is able to capture nuances that a strictly parametric geometric model struggles with. (2) It is able to recover absolute scale if a range sensor (e.g. a laser scanner) is used for ground truth, provided that the training and testing data are sufficiently similar. (3) It naturally provides measurement uncertainties. We extend the standard MOGP framework to infer joint estimates (full covariance matrices) for both translation and rotation, taking advantage of the fact that all estimates are correlated since they are derived from the same vehicle. We also replace the common zero-mean assumption of a Gaussian process with a standard geometric model of the camera, which provides an initial estimate that is then refined by the non-parametric model. The Gaussian process hyperparameters and the camera parameters are trained simultaneously, so there is still no need for traditional camera calibration, although known calibration values can be used to speed up training. This approach has been tested in a wide variety of situations, both in 2D, in urban and off-road environments (two degrees of freedom), and in 3D, with unmanned aerial vehicles (six degrees of freedom), with results that are comparable to those of standard state-of-the-art visual odometry algorithms and even of more traditional methods, such as wheel encoders and laser-based Iterative Closest Point. We also test the limits of its ability to generalize over environment changes by varying training and testing conditions independently, and by changing cameras between training and testing.
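The semi-parametric idea described above, a Gaussian process whose prior mean is a parametric model rather than zero, can be illustrated with a minimal single-output sketch. It is not the method of this paper: the mean here is a hypothetical linear function (`parametric_mean`) standing in for the geometric camera model, the inputs are random toy features rather than optical flow, the kernel is a plain squared exponential, and all hyperparameters are fixed instead of being trained jointly with the mean parameters. The function names and default values are assumptions chosen for illustration only.

```python
import numpy as np


def sq_exp_kernel(A, B, length_scale=1.0, signal_var=1.0):
    """Squared-exponential covariance between the rows of A and the rows of B."""
    sq_dist = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point cancellation.
    return signal_var * np.exp(-0.5 * np.maximum(sq_dist, 0.0) / length_scale**2)


def parametric_mean(X, w):
    """Hypothetical parametric mean: a linear map standing in for a geometric model."""
    return X @ w


def gp_predict(X_train, y_train, X_test, w, noise_var=1e-2):
    """GP regression with a non-zero prior mean.

    The GP is fitted to the residuals y - m(X); the parametric mean m is added
    back at the test inputs. Returns the predictive mean and the covariance of
    the latent function at X_test.
    """
    K = sq_exp_kernel(X_train, X_train) + noise_var * np.eye(len(X_train))
    K_s = sq_exp_kernel(X_train, X_test)
    K_ss = sq_exp_kernel(X_test, X_test)

    L = np.linalg.cholesky(K)
    residual = y_train - parametric_mean(X_train, w)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, residual))

    mean = parametric_mean(X_test, w) + K_s.T @ alpha
    v = np.linalg.solve(L, K_s)
    cov = K_ss - v.T @ v
    return mean, cov


# Toy data: a linear trend plus a smooth deviation the linear mean cannot explain.
rng = np.random.default_rng(0)
X_train = rng.uniform(-1.0, 1.0, size=(30, 2))
w = np.array([0.5, -0.3])
y_train = parametric_mean(X_train, w) + 0.1 * np.sin(3.0 * X_train[:, 0])

# One test input near the training data, one far away from it.
X_test = np.array([[0.2, -0.4], [100.0, 100.0]])
mean, cov = gp_predict(X_train, y_train, X_test, w)
std = np.sqrt(np.maximum(np.diag(cov), 0.0))
print("predictive mean:", mean)
print("predictive std: ", std)
```

Under these assumptions, the prediction far from the training data falls back to the parametric mean with the full prior variance, while near the training data the non-parametric term corrects the mean and the variance shrinks. This is the sense in which the parametric model supplies an initial estimate that the Gaussian process then refines, with an uncertainty attached to every estimate.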
