Accurate sensing of scene geo-context via mobile visual localization

Image geo-tagging has drawn a great deal of attention in recent years. The geographic information associated with images can support applications such as location recognition and virtual navigation. In this paper, we propose a novel approach for accurate mobile image geo-tagging in urban areas. Given a single query image, the approach provides a comprehensive set of geo-context information: the real-world location of the camera, its viewing angle, and the location of the captured scene. It also parses building facades and estimates their geometric structure. First, for the image to be geo-tagged, we perform partial-duplicate image retrieval to select crowd-sourced images that capture the same scene. We then employ the structure-from-motion technique to reconstruct a sparse 3D point cloud of the scene. Meanwhile, the geometric structure of the query image is analyzed to extract building facades. Finally, by combining the reconstructed 3D scene model with the extracted structure information, we register the camera location and viewing direction to a real-world map. The location of the captured building and the orientation of its facade are aligned to the map as well. Experimental results demonstrate the effectiveness of the proposed system.
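As an illustration of the final registration step, the sketch below shows how a camera position and compass bearing could be read off a structure-from-motion pose. It is not the paper's implementation. It assumes the common convention x_cam = R x_world + t, a camera that looks along its +z axis, and a world frame already aligned to the map with x pointing east, y north, and z up. The function names are hypothetical.

```python
import math

def transpose_times(R, v):
    """Return R^T v for a 3x3 matrix R (list of rows) and a 3-vector v."""
    return [sum(R[k][i] * v[k] for k in range(3)) for i in range(3)]

def camera_center_and_bearing(R, t):
    """Recover camera position and heading from a world-to-camera pose.

    Assumes x_cam = R x_world + t, so the camera center is C = -R^T t and
    the optical axis (camera +z) expressed in world coordinates is R^T e_z.
    Assumes the world frame is map-aligned: x = east, y = north, z = up.
    """
    center = [-c for c in transpose_times(R, t)]
    view_dir = transpose_times(R, [0.0, 0.0, 1.0])
    east, north = view_dir[0], view_dir[1]
    # Compass bearing: 0 degrees = north, increasing clockwise towards east.
    bearing = math.degrees(math.atan2(east, north)) % 360.0
    return center, view_dir, bearing

# Example: a camera at (2, 3, 0) looking due east.
# Rows of R are the camera's right, down, and forward axes in world coordinates.
R = [[0, -1, 0],
     [0, 0, -1],
     [1, 0, 0]]
t = [3, 0, -2]  # t = -R C for C = (2, 3, 0)
center, view_dir, bearing = camera_center_and_bearing(R, t)
print(center, view_dir, bearing)
```

In practice a structure-from-motion reconstruction is defined only up to a similarity transform, so the model would first have to be aligned to map coordinates (for example with geo-tagged reference images) before a bearing computed this way is meaningful.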
