Learning from mobile contexts to minimize the mobile location search latency

We propose to learn an extremely compact visual descriptor from the mobile contexts towards low bit rate mobile location search. Our scheme combines location related side information from the mobile devices to adaptively supervise the compact visual descriptor design in a flexible manner, which is very suitable to search locations or landmarks within a bandwidth constraint wireless link. Along with the proposed compact descriptor learning, a large-scale, contextual aware mobile visual search benchmark dataset PKUBench is also introduced, which serves as the first comprehensive benchmark for the quantitative evaluation of how the cheaply available mobile contexts can help the mobile visual search systems. Our proposed contextual learning based compact descriptor has shown to outperform the existing works in terms of compression rate and retrieval effectiveness.

[1]  Wen Gao,et al.  Learning Compact Visual Descriptor for Low Bit Rate Mobile Landmark Search , 2011, IJCAI.

[2]  Qi Tian,et al.  Task-Dependent Visual-Codebook Compression , 2012, IEEE Transactions on Image Processing.

[3]  Richard Szeliski,et al.  City-Scale Location Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[4]  Xing Xie,et al.  Vocabulary tree incremental indexing for scalable location recognition , 2008, 2008 IEEE International Conference on Multimedia and Expo.

[5]  Bernd Girod,et al.  Inverted Index Compression for Scalable Image Matching , 2010, 2010 Data Compression Conference.

[6]  Bernd Girod,et al.  Tree Histogram Coding for Mobile Image Matching , 2009, 2009 Data Compression Conference.

[7]  Gang Hua,et al.  Discriminant Embedding for Local Image Descriptors , 2007, 2007 IEEE 11th International Conference on Computer Vision.

[8]  Wen Gao,et al.  Towards compact topical descriptors , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.

[9]  Yan Ke,et al.  PCA-SIFT: a more distinctive representation for local image descriptors , 2004, CVPR 2004.

[10]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[11]  Rongrong Ji,et al.  Vocabulary hierarchy optimization for effective and transferable retrieval , 2009, CVPR.

[12]  Horst Bischof,et al.  From structure-from-motion point clouds to fast location recognition , 2009, CVPR.

[13]  Mor Naaman,et al.  How flickr helps us make sense of the world: context and content in community-contributed media collections , 2007, ACM Multimedia.

[14]  Gerard Salton,et al.  Term-Weighting Approaches in Automatic Text Retrieval , 1988, Inf. Process. Manag..

[15]  Luc Van Gool,et al.  HPAT Indexing for Fast Object/Scene Recognition Based on Local Appearance , 2003, CIVR.

[16]  Wen Gao,et al.  Location Discriminative Vocabulary Coding for Mobile Landmark Search , 2011, International Journal of Computer Vision.

[17]  W. Eric L. Grimson,et al.  Learning Patterns of Activity Using Real-Time Tracking , 2000, IEEE Trans. Pattern Anal. Mach. Intell..

[18]  Bernd Girod,et al.  Transform coding of image feature descriptors , 2009, Electronic Imaging.

[19]  Bernd Girod,et al.  Compression of image patches for local feature extraction , 2009, 2009 IEEE International Conference on Acoustics, Speech and Signal Processing.

[20]  Cordelia Schmid,et al.  A Performance Evaluation of Local Descriptors , 2005, IEEE Trans. Pattern Anal. Mach. Intell..

[21]  David G. Lowe,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004, International Journal of Computer Vision.

[22]  Jan-Michael Frahm,et al.  From structure-from-motion point clouds to fast location recognition , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[23]  Andrzej Sluzek,et al.  Image-Based Information Guide on Mobile Devices , 2008, ISVC.

[24]  Avideh Zakhor,et al.  Location-based image retrieval for urban environments , 2011, 2011 18th IEEE International Conference on Image Processing.

[25]  Michael Isard,et al.  Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[26]  Alexei A. Efros,et al.  IM2GPS: estimating geographic information from a single image , 2008, 2008 IEEE Conference on Computer Vision and Pattern Recognition.

[27]  Bernd Girod,et al.  CHoG: Compressed histogram of gradients A low bit-rate feature descriptor , 2009, CVPR.

[28]  Wen Gao,et al.  A lowbit rate vocabulary coding scheme for mobile landmark search , 2011, 2011 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP).

[29]  Jan-Michael Frahm,et al.  Modeling and Recognition of Landmark Image Collections Using Iconic Scene Graphs , 2008, International Journal of Computer Vision.

[30]  Yang Song,et al.  Tour the world: Building a web-scale landmark recognition engine , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[31]  Wen Gao,et al.  PKUBench: A context rich mobile visual search benchmark , 2011, 2011 18th IEEE International Conference on Image Processing.

[32]  Wei Zhang,et al.  Image Based Localization in Urban Environments , 2006, Third International Symposium on 3D Data Processing, Visualization, and Transmission (3DPVT'06).

[33]  Bernd Girod,et al.  Location coding for mobile image retrieval , 2009, MobiMedia.

[34]  Andrew Zisserman,et al.  Video Google: a text retrieval approach to object matching in videos , 2003, Proceedings Ninth IEEE International Conference on Computer Vision.

[35]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[36]  Luc Van Gool,et al.  SURF: Speeded Up Robust Features , 2006, ECCV.

[37]  Jon M. Kleinberg,et al.  Mapping the world's photos , 2009, WWW '09.

[38]  Xing Xie,et al.  Location sensitive indexing for image-based advertising , 2009, MM '09.