Local visual words coding for low bit rate mobile visual search

Mobile visual search has attracted extensive attention because of its potential in numerous applications. Research on this topic has focused on two schemes: sending query images, and sending compact descriptors extracted on the mobile phone. The first scheme requires transmitting about 30-40 KB of data, while the second reduces the bit rate roughly tenfold. In this paper, we propose a third scheme for extremely low bit rate mobile visual search, which sends compressed visual words, consisting of a vocabulary tree histogram and descriptor orientations, rather than the descriptors themselves. This scheme further reduces the bit rate at little extra computational cost on the client. Specifically, we store a vocabulary tree and extract visual descriptors on the mobile client. A lightweight pre-retrieval step is performed to obtain the visited leaf nodes in the vocabulary tree. The orientation of each local descriptor and the tree histogram are then encoded and transmitted to the server. Our new scheme transmits less than 1 KB of data, reducing the bit rate of the second scheme by a factor of 3, and obtains about a 30% improvement in search accuracy over the traditional Bag-of-Words baseline. The time cost is only 1.5 s on the client and 240 ms on the server.
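The client-side steps described above (descend the vocabulary tree for each descriptor, record the visited leaf nodes as a histogram, and keep a coarse orientation per descriptor) can be illustrated with a minimal Python sketch. It is not the paper's implementation: the `Node` class, the function names, and the choice of 8 uniform orientation bins are assumptions made for illustration, and the final compression of the histogram and orientations into a bitstream is omitted because the abstract does not specify it.

```python
import math
from collections import Counter


class Node:
    """Hypothetical vocabulary-tree node: a cluster center plus children."""

    def __init__(self, center, children=None, leaf_id=None):
        self.center = center
        self.children = children or []
        self.leaf_id = leaf_id  # set only on leaf nodes (visual words)


def nearest(point, centers):
    """Index of the center closest to point (squared Euclidean distance)."""
    return min(
        range(len(centers)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centers[i])),
    )


def quantize(root, descriptor):
    """Descend the tree greedily and return the id of the leaf reached."""
    node = root
    while node.children:
        idx = nearest(descriptor, [child.center for child in node.children])
        node = node.children[idx]
    return node.leaf_id


def quantize_orientation(theta, bins=8):
    """Map an angle in radians to one of `bins` uniform bins (assumed scheme)."""
    two_pi = 2 * math.pi
    return int((theta % two_pi) / two_pi * bins) % bins


def encode_query(root, features, bins=8):
    """Build the data a client would send instead of raw descriptors.

    features: list of (descriptor, orientation_in_radians) pairs.
    Returns (histogram, orientations), where histogram maps leaf id to
    count and orientations maps leaf id to a list of orientation bins.
    """
    histogram = Counter()
    orientations = {}
    for descriptor, theta in features:
        leaf = quantize(root, descriptor)
        histogram[leaf] += 1
        orientations.setdefault(leaf, []).append(quantize_orientation(theta, bins))
    return dict(histogram), orientations


# Toy two-level tree with four visual words over 2-D "descriptors".
toy_tree = Node(None, [
    Node((0, 0), [Node((0, 0), leaf_id=0), Node((0, 3), leaf_id=1)]),
    Node((10, 10), [Node((10, 10), leaf_id=2), Node((10, 13), leaf_id=3)]),
])
toy_features = [((0.1, 0.2), 0.1), ((0, 2.9), 3.2), ((9, 12.5), 1.6), ((0.2, 0.1), 4.8)]
toy_histogram, toy_orientations = encode_query(toy_tree, toy_features)
```

In this toy setup only the sparse histogram and one small integer per descriptor would need to be sent, which is the intuition behind transmitting visual words rather than descriptors; a real system would use a much larger tree and an entropy coder on top.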
