Real-time monocular image-based 6-DoF localization

We present a new real-time image-based localization method for scenes that have been reconstructed offline using structure from motion. From input video, our method continuously computes six-degree-of-freedom camera pose estimates by efficiently tracking natural features and matching them to 3D points reconstructed by structure from motion. Our main contribution lies in efficiently interleaving a fast keypoint tracker that uses inexpensive binary feature descriptors with a new approach for direct 2D-to-3D matching. Our 2D-to-3D matching scheme avoids the need for online extraction of scale-invariant features. Instead, we construct offline an indexed database that stores, for each 3D point, multiple DAISY descriptors extracted at multiple scales. The key to the efficiency of our method is to invoke DAISY descriptor extraction and matching sparingly during localization and to distribute this computation over a temporal window of successive frames. This enables the system to run in real time and achieve low per-frame latency over long durations. Our algorithm runs at over 30 Hz on a laptop and at 12 Hz on a low-power computer suitable for onboard computation on a mobile robot such as a micro-aerial vehicle. We have evaluated our method against ground truth and present results on several challenging indoor and outdoor sequences.
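
The idea of distributing expensive descriptor matching over a temporal window can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `schedule_matching`, the `budget` parameter, and the first-in-first-out queue are assumptions made for illustration. Each frame adds new candidate keypoints to a pending queue, and at most `budget` of them are handed to the costly matching step per frame, so the per-frame cost stays bounded. The sketch also assumes that a keypoint waiting in the queue is carried forward by the cheap tracker, so that a delayed match still applies to it.

```python
from collections import deque


def schedule_matching(candidates_per_frame, budget):
    """Spread expensive matching jobs over successive frames.

    candidates_per_frame: list of lists; entry i holds the (hypothetical)
        keypoint ids that become candidates for 2D-to-3D matching at frame i.
    budget: maximum number of expensive matches allowed per frame (assumed knob).

    Returns (processed, leftover): processed[i] lists the (origin_frame, id)
    jobs executed during frame i; leftover lists jobs still pending at the end.
    """
    pending = deque()
    processed = []
    for frame_idx, candidates in enumerate(candidates_per_frame):
        # New candidates join the back of the queue, tagged with their origin frame.
        pending.extend((frame_idx, c) for c in candidates)
        # Run at most `budget` expensive jobs in this frame; the rest wait.
        batch = []
        while pending and len(batch) < budget:
            batch.append(pending.popleft())
        processed.append(batch)
    return processed, list(pending)


if __name__ == "__main__":
    done, left = schedule_matching([["a", "b", "c", "d", "e"], [], ["f"]], budget=2)
    for i, batch in enumerate(done):
        print(f"frame {i}: {batch}")
    print("still pending:", left)
```

With a budget of 2, the five candidates from the first frame are spread over three frames instead of being processed in a single burst, which is the latency-smoothing effect the abstract describes.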