Geo-semantic segmentation

The availability of GIS (Geographical Information System) databases for many urban areas provides a valuable source of information for improving the performance of many computer vision tasks. In this paper, we propose a method that leverages information from GIS databases to perform semantic segmentation of an image while geo-referencing each semantic segment with its address and geo-location. First, the image is segmented into a set of initial super-pixels. Then, by projecting the information from GIS databases onto the image plane, we obtain a set of priors on the approximate location of semantic entities such as buildings and streets. However, the projections contain significant inaccuracies (misalignments), mainly due to inaccurate GPS-tags and camera parameters. To address this misalignment, we perform data fusion with an iterative approach that improves the accuracy of the segmentation and of the GIS projections simultaneously. At each iteration, the projections are evaluated and weighted in terms of reliability, and then fused with the super-pixel segmentation. First, segmentation is performed using random walks, based on the GIS projections. Then, the global transformation that best aligns the projections with their corresponding semantic entities is computed and applied to the projections to further align them with the content of the image. The iteration continues until the projections and segments are well aligned.
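The alternation between segmentation and alignment described above can be illustrated with a minimal toy sketch. It is not the method of the paper: the random-walk segmentation is replaced by a simple per-super-pixel coverage vote, the global transformation is restricted to a small integer translation chosen by intersection-over-union, and the reliability weighting is omitted. All function names (`shift_mask`, `segment`, `best_shift`, `align`) and parameters (`threshold`, `radius`, `max_iters`) are hypothetical.

```python
import numpy as np


def shift_mask(mask, dy, dx):
    """Translate a boolean mask by (dy, dx) pixels, padding with False."""
    h, w = mask.shape
    out = np.zeros_like(mask)
    src = mask[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
    y0, x0 = max(0, dy), max(0, dx)
    out[y0:y0 + src.shape[0], x0:x0 + src.shape[1]] = src
    return out


def segment(superpixels, projection, threshold=0.5):
    """Label a super-pixel as foreground when the projected prior covers
    more than `threshold` of its pixels (a stand-in for random walks)."""
    ids = superpixels.ravel()
    n = int(ids.max()) + 1
    covered = np.bincount(ids, weights=projection.ravel().astype(float), minlength=n)
    size = np.bincount(ids, minlength=n)
    frac = covered / np.maximum(size, 1)
    return (frac > threshold)[superpixels]


def best_shift(projection, segment_mask, radius=2):
    """Search integer translations within `radius` and return the one that
    maximizes IoU between the shifted projection and the segment mask."""
    candidates = [(dy, dx)
                  for dy in range(-radius, radius + 1)
                  for dx in range(-radius, radius + 1)]
    # Try small shifts first so that ties favor leaving the projection in place.
    candidates.sort(key=lambda s: abs(s[0]) + abs(s[1]))
    best, best_score = (0, 0), -1.0
    for dy, dx in candidates:
        shifted = shift_mask(projection, dy, dx)
        union = np.logical_or(shifted, segment_mask).sum()
        inter = np.logical_and(shifted, segment_mask).sum()
        score = inter / union if union else 0.0
        if score > best_score:
            best, best_score = (dy, dx), score
    return best


def align(superpixels, projection, max_iters=10):
    """Alternate segmentation and alignment until the shift is zero."""
    for _ in range(max_iters):
        seg = segment(superpixels, projection)
        dy, dx = best_shift(projection, seg)
        if (dy, dx) == (0, 0):
            break
        projection = shift_mask(projection, dy, dx)
    return segment(superpixels, projection), projection
```

In this toy setting the super-pixel boundaries supply the image evidence: a slightly misaligned prior still covers most of the correct super-pixels, so the segmentation snaps to them and the subsequent shift pulls the prior back into place. The sketch recovers only small offsets; larger misalignments would require a richer transformation model and a more robust segmentation step.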
