Robust User Community-Aware Landmark Photo Retrieval

Given a query photo of a location-aware landmark shot by a user, landmark retrieval returns a set of photos ranked by their similarity to the query. Existing studies on landmark retrieval focus on exploiting location-aware visual features or attributes to match candidate images against a query image. However, these approaches rest on the assumption that the landmark of interest is well captured and distinctive enough to be distinguished from others. In practice, even distinctive landmarks may be captured poorly because of bad viewpoints or angles, and such a biased query photo degrades the recognition results. In this paper, we present a novel approach to landmark retrieval that exploits the dimension of user community. Our approach consists of three steps. First, we extract communities based on user interest, which characterize groups of users in terms of their social media activities such as user-generated content and comments. Then, the photos recommended by the community to which the query user belongs, together with the query photo, constitute a set of multiple queries. Finally, a pattern mining algorithm is presented to discover regular landmark-specific patterns from this multi-query set. These patterns can faithfully represent the characteristics of the landmark of interest. Experiments on benchmarks show the effectiveness of our approach.
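The second and third steps can be illustrated with a minimal sketch. It is not the algorithm of the paper: communities are assumed to be given already, each photo is assumed to be represented as a list of quantized visual-word IDs, and a plain support-count itemset miner stands in for the pattern mining step. All names (`build_multi_query`, `mine_patterns`, `score`) and the toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_multi_query(query_photo, user, communities, recommendations):
    """Return the query photo followed by photos recommended by the user's community."""
    # Assumption: communities maps a community name to its set of member users.
    community = next((c for c, members in communities.items() if user in members), None)
    extra = recommendations.get(community, [])
    queries = []
    for photo in [query_photo, *extra]:
        if photo not in queries:  # keep order, drop duplicates
            queries.append(photo)
    return queries


def mine_patterns(photo_words, min_support, max_size=2):
    """Keep visual-word itemsets that occur in at least min_support photos.

    Simple stand-in for a pattern mining step, not the paper's method.
    """
    counts = Counter()
    for words in photo_words:
        unique = sorted(set(words))
        for size in range(1, max_size + 1):
            counts.update(combinations(unique, size))
    return {items: n for items, n in counts.items() if n >= min_support}


def score(candidate_words, patterns):
    """Sum the support of every mined pattern fully contained in the candidate."""
    words = set(candidate_words)
    return sum(n for items, n in patterns.items() if words.issuperset(items))


# Toy data (hypothetical): user communities, their recommended photos,
# and visual-word IDs per photo.
communities = {"architecture": {"alice", "bob"}, "food": {"carol"}}
recommendations = {"architecture": ["p2", "p3"], "food": ["p9"]}
visual_words = {
    "q": [1, 2, 7],
    "p2": [1, 2, 3],
    "p3": [1, 2, 4],
    "c1": [1, 2, 5],
    "c2": [7, 8, 9],
}

queries = build_multi_query("q", "alice", communities, recommendations)
patterns = mine_patterns([visual_words[p] for p in queries], min_support=3)
ranking = sorted(["c1", "c2"], key=lambda p: score(visual_words[p], patterns), reverse=True)
```

In this toy setup the visual word 7, which appears only in the query photo, is not kept as a pattern, so a candidate sharing only that word is ranked below a candidate sharing the words common to the whole multi-query set. This mirrors the motivation above: patterns supported by the community's photos can compensate for a biased single query.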
