Paper information - Semantic segmentation for 3D localization in urban environments

Semantic segmentation for 3D localization in urban environments

We show how to use simple 2.5D maps of buildings and recent advances in image segmentation and machine learning to geo-localize an input image of an urban scene. We first extract the façades of the buildings and their edges from the image, and then look for the orientation and location that align a 3D rendering of the map with these segments. We discuss how to use a 3D tracking system to acquire the training data for the segmentation method, the segmentation method itself, and how we use the resulting segmentations to evaluate the quality of the alignment.
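The alignment step described in the abstract can be illustrated with a minimal sketch. The abstract does not specify how alignment quality is computed, so the code below assumes a simple stand-in: the fraction of pixels whose segmented label agrees with the label rendered from the map at a candidate pose. The names `alignment_score`, `best_pose`, and `toy_render`, the label ids, and the pose layout are all hypothetical and are not taken from the paper.

```python
from typing import Callable, Iterable, List, Tuple

# Hypothetical label ids (assumption): 0 = background, 1 = facade.
LabelMap = List[List[int]]
# Hypothetical pose layout (assumption): (x, y, heading).
Pose = Tuple[float, float, float]


def alignment_score(segmentation: LabelMap, rendering: LabelMap) -> float:
    """Fraction of pixels where the segmented and rendered labels agree."""
    total = 0
    agree = 0
    for seg_row, ren_row in zip(segmentation, rendering):
        for seg_label, ren_label in zip(seg_row, ren_row):
            total += 1
            agree += int(seg_label == ren_label)
    return agree / total if total else 0.0


def best_pose(
    segmentation: LabelMap,
    candidates: Iterable[Pose],
    render: Callable[[Pose], LabelMap],
) -> Pose:
    """Exhaustively pick the candidate pose whose rendering agrees best."""
    return max(
        candidates,
        key=lambda pose: alignment_score(segmentation, render(pose)),
    )


def toy_render(pose: Pose) -> LabelMap:
    """Toy stand-in for rendering a 2.5D map: a facade two pixels wide
    whose horizontal position is given by the first pose component."""
    shift = int(pose[0])
    row = [1 if shift <= col < shift + 2 else 0 for col in range(6)]
    return [list(row), list(row)]


# Pretend the segmentation network produced the labels seen from shift = 3.
observed = toy_render((3, 0.0, 0.0))
candidates = [(shift, 0.0, 0.0) for shift in range(5)]
estimate = best_pose(observed, candidates, toy_render)
```

In this toy setup the search is a brute-force scan over a handful of candidates; a real system would render the 2.5D map with a proper camera model and would likely use a more selective search and a richer score that also accounts for façade edges.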

Vincent Lepetit | Anil Armagan | Martin Hirzer
