Paper information - Finding perfect rendezvous on the go: accurate mobile visual localization and its applications to routing

More and more people use their phones on the go to access ubiquitous location-based services (LBS). One of the fundamental problems of LBS is localization. Researchers are now investigating the use of a phone-captured image for localization, because an image carries more scene context than the phone's embedded sensors. In this paper, we present a novel approach to mobile visual localization that accurately senses the geographic scene context from the current image (typically associated with a rough GPS position). Unlike most existing visual localization methods, the proposed approach provides a complete and more accurate set of parameters about the scene's geo-context: the actual location of the mobile user, the location of the captured scene (perhaps more importantly), and the viewing direction. Our approach builds on advanced techniques for large-scale image retrieval and 3D model reconstruction from photos. Specifically, we first perform joint geo-visual clustering in the cloud to generate scene clusters, with each scene represented by a 3D model. The 3D scene models are then indexed using a visual vocabulary tree. The phone-captured image is used to retrieve the relevant scene models; it is then aligned with those models and registered to the real-world map. Our approach estimates the user location to within 14 meters, the viewing direction to within 9 degrees, and the scene location to within 21 meters. Such a complete set of accurate geo-parameters enables LBS routing applications that cannot be achieved with most existing methods. In particular, we showcase three novel applications: 1) accurate self-localization, 2) collaborative localization for rendezvous routing, and 3) routing for photographing. Evaluations through user studies indicate that these applications are effective in facilitating the perfect rendezvous for mobile users.
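To make the three reported geo-parameters concrete, the following minimal Python sketch shows one way the quantities in the abstract could be related and compared against ground truth. It is not the paper's method: the function names, the spherical-Earth radius, and the simplification that the viewing direction is the compass bearing from the user location to the scene location are all assumptions made for illustration.

```python
import math

# Assumption of this sketch: a spherical Earth with the mean radius below.
EARTH_RADIUS_M = 6371000.0


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def bearing_deg(lat1, lon1, lat2, lon2):
    """Initial compass bearing in degrees (0 = north, 90 = east) from point 1 to point 2."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlmb = math.radians(lon2 - lon1)
    y = math.sin(dlmb) * math.cos(p2)
    x = math.cos(p1) * math.sin(p2) - math.sin(p1) * math.cos(p2) * math.cos(dlmb)
    return math.degrees(math.atan2(y, x)) % 360.0


def angular_error_deg(a, b):
    """Smallest absolute difference between two headings, in degrees."""
    d = abs(a - b) % 360.0
    return min(d, 360.0 - d)


def geo_errors(est_user, est_scene, true_user, true_scene):
    """Return (user error in m, viewing-direction error in deg, scene error in m).

    Each argument is a hypothetical (lat, lon) tuple in degrees. The viewing
    direction is approximated as the bearing from user to scene.
    """
    user_err = haversine_m(*est_user, *true_user)
    scene_err = haversine_m(*est_scene, *true_scene)
    dir_err = angular_error_deg(
        bearing_deg(*est_user, *est_scene),
        bearing_deg(*true_user, *true_scene),
    )
    return user_err, dir_err, scene_err
```

Under these assumptions, an estimate would fall within the accuracy reported in the abstract when `geo_errors` returns values no larger than 14 meters, 9 degrees, and 21 meters respectively.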
