Augmenting mobile city-view image retrieval with context-rich user-contributed photos

With the proliferation of mobile devices, the demand for location-based services is growing. Taking advantage of GPS information, we can roughly estimate a user's location. However, extra information (e.g., photos) is needed to precisely locate the object of interest on a mobile device for further applications such as mobile search. A user can simply take a GPS-tagged picture of a target of interest to retrieve information about the building, so building a real-time building recognition or retrieval system becomes a challenging problem. Most recent approaches recognize buildings from street-view images; however, query photos taken with mobile devices usually exhibit different lighting conditions. To provide a more robust city-view image retrieval system, we propose to augment the visual diversity of the database images by integrating context-rich user-contributed photos from social media. Preliminary experimental results show that street-view images provide different viewing angles of the target, whereas user-contributed photos enhance its visual diversity. In addition, we combine both visual and GPS constraints in the retrieval process over an inverted index to achieve real-time retrieval.
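To make the last point concrete, the sketch below shows one way a GPS constraint can be combined with inverted-index retrieval over quantized visual words. This is a minimal illustration only, assuming images have already been quantized into bag-of-visual-words representations; the class and function names (GeoVisualIndex, haversine_km) and the simple normalized-term-count scoring are hypothetical and are not the paper's actual implementation.

```python
import math
from collections import defaultdict


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two GPS coordinates, in kilometers."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))


class GeoVisualIndex:
    """Illustrative inverted index over visual words with a GPS pre-filter."""

    def __init__(self):
        self.postings = defaultdict(list)  # visual word id -> list of image ids
        self.gps = {}                      # image id -> (lat, lon)
        self.doc_len = defaultdict(int)    # image id -> number of indexed words

    def add_image(self, image_id, visual_words, lat, lon):
        """Index one database image (street-view or user-contributed photo)."""
        self.gps[image_id] = (lat, lon)
        for w in visual_words:
            self.postings[w].append(image_id)
            self.doc_len[image_id] += 1

    def query(self, visual_words, lat, lon, radius_km=1.0, top_k=10):
        """Score only images whose GPS position lies within radius_km of the query."""
        candidates = {img for img, (la, lo) in self.gps.items()
                      if haversine_km(lat, lon, la, lo) <= radius_km}
        scores = defaultdict(float)
        for w in visual_words:
            for img in self.postings.get(w, []):
                if img in candidates:
                    scores[img] += 1.0 / self.doc_len[img]  # simple normalized term count
        return sorted(scores.items(), key=lambda kv: -kv[1])[:top_k]
```

In such a scheme, the GPS constraint shrinks the candidate set before visual-word scoring, which is one plausible way to keep retrieval latency low enough for a real-time mobile system.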
