The stanford mobile visual search data set

We survey popular data sets used in computer vision literature and point out their limitations for mobile visual search applications. To overcome many of the limitations, we propose the Stanford Mobile Visual Search data set. The data set contains camera-phone images of products, CDs, books, outdoor landmarks, business cards, text documents, museum paintings and video clips. The data set has several key characteristics lacking in existing data sets: rigid objects, widely varying lighting conditions, perspective distortion, foreground and background clutter, realistic ground-truth reference data, and query data collected from heterogeneous low and high-end camera phones. We hope that the data set will help push research forward in the field of mobile visual search.

[1]  Robert C. Bolles,et al.  Random sample consensus: a paradigm for model fitting with applications to image analysis and automated cartography , 1981, CACM.

[2]  Christiane Fellbaum,et al.  Book Reviews: WordNet: An Electronic Lexical Database , 1999, CL.

[3]  Stepán Obdrzálek,et al.  Image Retrieval Using Local Compact DCT-Based Representation , 2003, DAGM-Symposium.

[4]  G LoweDavid,et al.  Distinctive Image Features from Scale-Invariant Keypoints , 2004 .

[5]  Stepán Obdrzálek,et al.  Sub-linear Indexing for Large Scale Object Recognition , 2005, BMVC.

[6]  David Nistér,et al.  Scalable Recognition with a Vocabulary Tree , 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[7]  Michael Isard,et al.  Object retrieval with large vocabularies and fast spatial matching , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.

[8]  Cordelia Schmid,et al.  Hamming Embedding and Weak Geometric Consistency for Large Scale Image Search , 2008, ECCV.

[9]  Bernd Girod,et al.  Outdoors augmented reality on mobile phone using loxel-based visual feature organization , 2008, MIR '08.

[10]  Berna Erol,et al.  HOTPAPER: multimedia interaction with paper using mobile phones , 2008, ACM Multimedia.

[11]  Li Fei-Fei,et al.  ImageNet: A large-scale hierarchical image database , 2009, CVPR.

[12]  Christopher Hunt,et al.  Notes on the OpenSURF Library , 2009 .

[13]  Huizhong Chen,et al.  Permutable descriptors for orientation-invariant image matching , 2010, Optical Engineering + Applications.

[14]  Bernd Girod,et al.  Dynamic selection of a feature-rich query frame for mobile video retrieval , 2010, 2010 IEEE International Conference on Image Processing.

[15]  Andrea Vedaldi,et al.  Vlfeat: an open and portable library of computer vision algorithms , 2010, ACM Multimedia.

[16]  Bernd Girod,et al.  Compressed Histogram of Gradients: A Low-Bitrate Descriptor , 2011, International Journal of Computer Vision.

[17]  Bernd Girod,et al.  Mobile Visual Search , 2011, IEEE Signal Processing Magazine.