Tap-to-search: Interactive and contextual visual search on mobile devices

Mobile visual search has been an emerging topic for both the research and industrial communities. Among various methods, visual search has the merit of providing an alternative solution where text and voice search are not applicable. This paper proposes an interactive "tap-to-search" approach that combines the individual's intention, expressed by selecting regions of interest via "tap" actions on the mobile touch screen, with a recognition-by-search mechanism over a large-scale image database. An automatic image segmentation technique is applied to provide region candidates. Visual-vocabulary-tree-based search is adopted, incorporating rich contextual information collected from mobile sensors. The proposed approach was evaluated on an image dataset of two million images. We demonstrate that, by using GPS contextual information, the approach achieves satisfactory results under standard information retrieval evaluation.
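The pipeline described above can be illustrated with a minimal sketch: database images are represented as bags of "visual words" (leaf nodes of a vocabulary tree), the tapped region's descriptors are assumed to be already quantized into word ids, candidates are first gated by GPS proximity, and survivors are ranked by TF-IDF cosine similarity. All names, data, and the degree-based GPS gate are hypothetical, simplified for illustration; the paper's actual system uses a large vocabulary tree over local features.

```python
import math
from collections import Counter

# Hypothetical toy database: each entry holds quantized visual-word ids
# (leaf indices of a vocabulary tree) and a GPS coordinate.
database = {
    "img_a": {"words": [1, 2, 2, 5], "gps": (37.77, -122.42)},
    "img_b": {"words": [2, 3, 3, 7], "gps": (48.86, 2.35)},
    "img_c": {"words": [1, 5, 5, 9], "gps": (37.78, -122.41)},
}

def tfidf_vector(words, df, n):
    """TF-IDF weighting of a bag of visual words."""
    tf = Counter(words)
    return {w: (c / len(words)) * math.log(n / df[w])
            for w, c in tf.items() if w in df}

def cosine(u, v):
    dot = sum(u[w] * v[w] for w in u if w in v)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def geo_close(p, q, max_deg=0.1):
    # Crude GPS gate: keep candidates within ~0.1 degrees of latitude
    # and longitude (purely illustrative; a real system would use a
    # proper geodesic distance).
    return abs(p[0] - q[0]) <= max_deg and abs(p[1] - q[1]) <= max_deg

def tap_to_search(query_words, query_gps, db):
    """Rank GPS-nearby database images by visual-word similarity."""
    n = len(db)
    df = Counter(w for e in db.values() for w in set(e["words"]))
    qv = tfidf_vector(query_words, df, n)
    candidates = [name for name, e in db.items()
                  if geo_close(query_gps, e["gps"])]
    return sorted(candidates,
                  key=lambda name: cosine(qv, tfidf_vector(
                      db[name]["words"], df, n)),
                  reverse=True)

# A query tapped near San Francisco excludes the Paris image by GPS,
# then ranks the remaining candidates by visual similarity.
print(tap_to_search([1, 5, 5], (37.77, -122.42), database))
# → ['img_c', 'img_a']
```

The GPS gate illustrates why context helps at scale: it shrinks the candidate set before the (comparatively expensive) visual matching, which also removes visually similar but geographically implausible distractors.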
