Mobile visual search via hierarchical sparse coding

Mobile visual search has recently attracted considerable research attention. Existing work focuses on the limited capacity of the wireless channel but overlooks its instability, and is therefore not adaptive to changes in channel capacity. In this paper, we propose a novel image retrieval algorithm that scales to varying channel conditions. The paper makes three contributions: (1) to achieve instant retrieval under varying channel capacity, we adjust the transmission load through sparseness rather than codebook size; (2) we introduce hierarchical sparse coding into the retrieval workflow, transforming the original codebook into a tree-structured dictionary that implies the priority of its elements; (3) we propose transmission priority ranking schemes that adapt to the specific query. Experimental results show that the proposed algorithm outperforms BoW-based and Lasso-based algorithms under different parameter settings. Retrieval results under different channel limitations validate the scalability of our method.
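
The idea of controlling transmission load through sparseness can be illustrated with a minimal sketch. It is not the paper's method: the function name `truncate_sparse_code`, the fixed cost `bits_per_entry`, and the magnitude-based ranking are all assumptions made for illustration, standing in for the tree-structured priority and query-adaptive ranking schemes described above. Given a sparse code over a fixed codebook and a channel budget in bits, it keeps only as many nonzero coefficients as the budget allows.

```python
from typing import List, Tuple


def truncate_sparse_code(
    coeffs: List[float],
    budget_bits: int,
    bits_per_entry: int = 24,  # assumed cost of one (index, value) pair
) -> List[Tuple[int, float]]:
    """Return the (index, value) pairs to transmit within budget_bits.

    Ranking by absolute value is an illustrative assumption; the paper
    derives priority from a tree-structured dictionary instead.
    """
    # Number of (index, value) pairs that fit in the budget.
    max_entries = max(budget_bits // bits_per_entry, 0)

    # Only nonzero coefficients carry information worth sending.
    nonzero = [(i, c) for i, c in enumerate(coeffs) if c != 0.0]

    # Highest-magnitude coefficients get the highest transmission priority.
    ranked = sorted(nonzero, key=lambda pair: abs(pair[1]), reverse=True)

    # Re-sort the survivors by codebook index for a stable wire order.
    return sorted(ranked[:max_entries])


if __name__ == "__main__":
    code = [0.0, 0.9, 0.0, -0.4, 0.1, 0.0, 0.7, 0.0]
    print(truncate_sparse_code(code, budget_bits=48))  # narrow channel
    print(truncate_sparse_code(code, budget_bits=96))  # wider channel
```

In this sketch the codebook size stays fixed at eight atoms, while the number of transmitted coefficients shrinks or grows with the available budget, which is the sense in which sparseness rather than codebook size sets the load.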
