Mobile devices are ubiquitous. People use their phones as personal concierges, not only to discover information but also to search for particular interests on the go and to make decisions. This opens a new horizon for multimedia retrieval on mobile devices. While existing efforts have predominantly focused on understanding textual or voice queries, this paper presents a new perspective: understanding visual queries captured by the built-in camera so that mobile-based social activities can be recommended for users to complete. In this work, a contextual model based on the query image is proposed for visual search. A mobile user can take a photo and naturally indicate an object of interest within it via a circle-based gesture called the “O” gesture. Both the selected object-of-interest region and the surrounding visual context in the photo are used to achieve search-based recognition by retrieving similar images with a large-scale visual vocabulary tree. Consequently, social activities such as visiting contextually relevant entities (i.e., local businesses) are recommended to users based on their visual queries and GPS location. Along with the proposed method, an exemplary real application has been developed on Windows Phone 7 devices and evaluated across a wide variety of scenarios on a million-scale image database. To test the performance of the proposed mobile visual search model, extensive experiments have been conducted and the results compared with state-of-the-art algorithms in the content-based image retrieval (CBIR) domain.
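The idea of combining the gesture-selected region with its surrounding context can be illustrated with a small sketch. This is not the paper's model; the abstract does not specify how context is weighted. The sketch assumes that each local feature has already been quantized to a visual word ID (for example by a vocabulary tree), that features inside the “O” circle get full weight, and that features outside it decay with a Gaussian falloff. The function names, the `sigma` parameter, and the smoothed IDF formula are all hypothetical choices made for illustration.

```python
import math
from collections import Counter


def context_weight(x, y, cx, cy, radius, sigma):
    """Assumed weighting: 1.0 inside the gesture circle, Gaussian decay outside."""
    d = math.hypot(x - cx, y - cy)
    if d <= radius:
        return 1.0
    return math.exp(-((d - radius) ** 2) / (2.0 * sigma ** 2))


def compute_idf(database):
    """database maps image_id -> {visual_word: count}. Returns smoothed IDF per word."""
    n = len(database)
    df = Counter()
    for counts in database.values():
        df.update(set(counts))
    return {w: math.log((n + 1) / (c + 1)) + 1.0 for w, c in df.items()}


def weighted_histogram(features, circle, sigma, idf):
    """features is a list of (visual_word, x, y); circle is (cx, cy, radius)."""
    cx, cy, radius = circle
    hist = {}
    for word, x, y in features:
        w = context_weight(x, y, cx, cy, radius, sigma) * idf.get(word, 0.0)
        hist[word] = hist.get(word, 0.0) + w
    return hist


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def rank(features, circle, database, sigma=50.0):
    """Rank database images by similarity to the context-weighted query."""
    idf = compute_idf(database)
    query = weighted_histogram(features, circle, sigma, idf)
    scored = []
    for image_id, counts in database.items():
        doc = {w: c * idf[w] for w, c in counts.items()}
        scored.append((cosine(query, doc), image_id))
    scored.sort(reverse=True)
    return [(image_id, score) for score, image_id in scored]


# Toy data: two database images and one photo containing both kinds of content.
database = {"cafe": {1: 3, 2: 1}, "bookstore": {3: 4, 2: 1}}
photo = [(1, 100, 100), (3, 400, 400), (3, 410, 400), (3, 400, 410)]

# The same photo yields different top results depending on where the "O" is drawn.
top_when_circling_word_1 = rank(photo, (100, 100, 30), database)[0][0]
top_when_circling_word_3 = rank(photo, (400, 400, 30), database)[0][0]
```

In this toy setup the same photo retrieves a different top match depending on the circled region, which is the behavior the gesture is meant to provide. A real system would replace the linear scan with an inverted index over the vocabulary tree and would filter or re-rank candidates by GPS location before recommending entities.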