Paper Information - Accurate Camera Registration in Urban Environments Using High-Level Feature Matching

We propose a method for accurate camera pose estimation in urban environments from single images and 2D maps made of the surrounding buildings’ outlines. Our approach bridges the gap between learning-based and geometric approaches: we use recent semantic segmentation techniques to extract the buildings’ edges and the façades’ normals in the images, and minimal solvers [14] to compute the camera pose accurately and robustly. We propose two such minimal solvers: one based on three correspondences between building corners in the image and in the 2D map, and another based on two corner correspondences plus one façade correspondence. We show on a challenging dataset that, compared to the recent state of the art [1], this approach is both faster and more accurate.
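To make the three-corner idea concrete, here is a minimal sketch of a planar pose solver. It is not the solver from the paper, whose formulation the abstract does not give. It assumes an upright camera with known intrinsics, so that each building corner appears as a vertical image line that can be converted to a horizontal bearing angle, and so that the pose reduces to a 2D position plus a heading. The function name `pose_from_three_corners` and its inputs are hypothetical.

```python
import numpy as np

def pose_from_three_corners(map_pts, bearings):
    """Illustrative 2D resection (assumption: not the paper's solver).

    map_pts:  (3, 2) building corners in map coordinates.
    bearings: (3,) horizontal angles (radians) of those corners in the
              camera frame, measured from the camera's forward axis.
    Returns (position, heading) in the map frame, or None if no valid pose.
    """
    P = np.asarray(map_pts, dtype=float)
    alpha = np.asarray(bearings, dtype=float)
    a, b = np.cos(alpha), np.sin(alpha)

    # Each corner, rotated and shifted into the camera frame, must be
    # parallel to its bearing direction (a_i, b_i). That constraint is
    # linear in w = (cos(theta), sin(theta), tx, ty), where (tx, ty) is the
    # camera position expressed in the camera frame.
    A = np.column_stack([
        P[:, 0] * b - P[:, 1] * a,
        P[:, 1] * b + P[:, 0] * a,
        -b,
        a,
    ])

    # Three equations, four unknowns: take the null vector of A.
    _, _, Vt = np.linalg.svd(A)
    w = Vt[-1]
    norm = np.hypot(w[0], w[1])
    if norm < 1e-12:
        return None

    # The null vector has a sign ambiguity; keep the sign that puts all
    # three corners in front of the camera along their bearing rays.
    for sign in (1.0, -1.0):
        c, s, tx, ty = sign * w / norm
        R = np.array([[c, -s], [s, c]])      # camera-to-map rotation
        q = P @ R - np.array([tx, ty])       # corners in the camera frame
        if np.all(q[:, 0] * a + q[:, 1] * b > 0):
            return R @ np.array([tx, ty]), np.arctan2(s, c)
    return None
```

Because it needs only three correspondences, a solver of this kind can be run inside a RANSAC loop [17] over candidate corner matches. The sketch does not handle the degenerate case in which the camera lies on the circle through the three corners, where the pose is not unique.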

[1] Mubarak Shah, et al. Accurate Image Localization Based on Google Maps Street View, 2010, ECCV.

[2] Roberto Cipolla, et al. PoseNet: A Convolutional Network for Real-Time 6-DOF Camera Relocalization, 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[3] Thomas Brox, et al. U-Net: Convolutional Networks for Biomedical Image Segmentation, 2015, MICCAI.

[4] Tsuhan Chen, et al. GPS Refinement and Camera Orientation Estimation from a Single Image and a 2D Map, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.

[5] Zuzana Kukelova, et al. A general solution to the P4P problem for camera with unknown focal length, 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[6] Patrick Pérez, et al. Incremental dense semantic stereo fusion for large-scale semantic scene reconstruction, 2015, 2015 IEEE International Conference on Robotics and Automation (ICRA).

[7] Peter F. Sturm, et al. Pose estimation using both points and lines for geo-localization, 2011, 2011 IEEE International Conference on Robotics and Automation.

[8] Jan Dirk Wegner, et al. Large-Scale Semantic 3D Reconstruction: An Adaptive Multi-resolution Model for Multi-class Volumetric Labeling, 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9] Jianliang Tang, et al. Complete Solution Classification for the Perspective-Three-Point Problem, 2003, IEEE Trans. Pattern Anal. Mach. Intell.

[10] Mayank Bansal, et al. Geometric Urban Geo-localization, 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[11] Roberto Cipolla, et al. SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation, 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[12] Philip David, et al. Orientation descriptors for localization in urban environments, 2011, 2011 IEEE/RSJ International Conference on Intelligent Robots and Systems.

[13] Marc Pollefeys, et al. Large Scale Visual Geo-Localization of Images in Mountainous Terrain, 2012, ECCV.

[14] Richard Szeliski, et al. City-Scale Location Recognition, 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[15] Zuzana Kukelova, et al. Real-Time Solution to the Absolute Pose Problem with Unknown Radial Distortion and Focal Length, 2013, 2013 IEEE International Conference on Computer Vision.

[16] Michel Dhome, et al. Determination of the Attitude of 3D Objects from a Single Perspective View, 1989, IEEE Trans. Pattern Anal. Mach. Intell.

[17] Robert C. Bolles, et al. Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography, 1981, CACM.

[18] Matthijs C. Dorst. Distinctive Image Features from Scale-Invariant Keypoints, 2011.

[19] Robert M. Haralick, et al. Review and analysis of solutions of the three point perspective pose estimation problem, 1994, International Journal of Computer Vision.

[20] David Nistér, et al. An efficient solution to the five-point relative pose problem, 2003, 2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003. Proceedings.

[21] Liang-Tien Chia, et al. Estimating camera pose from a single urban ground-view omnidirectional image and a 2D building outline map, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[22] Vincent Lepetit, et al. Instant Outdoor Localization and SLAM Initialization from 2.5D Maps, 2015, IEEE Transactions on Visualization and Computer Graphics.

[23] Marc Pollefeys, et al. Registration of Spherical Panoramic Images with Cadastral 3D Models, 2012, 2012 Second International Conference on 3D Imaging, Modeling, Processing, Visualization & Transmission.

[24] Amir Roshan Zamir, et al. City scale geo-spatial trajectory estimation of a moving camera, 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[25] Vincent Lepetit, et al. Learning to Align Semantic Segmentation and 2.5D Maps for Geolocalization, 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[26] Zuzana Kukelova, et al. Automatic Generator of Minimal Problem Solvers, 2008, ECCV.