Residual Enhanced Visual Vectors for on-device image matching

Most mobile visual search (MVS) systems query a large database stored on a server. This paper presents a new architecture for searching a large database directly on a mobile device, which makes image retrieval network-independent, low-latency, and privacy-protected. A key challenge for on-device MVS is fitting a memory-intensive database into the limited RAM of the mobile device. We design and implement a new compact global image signature called the Residual Enhanced Visual Vector (REVV) that is optimized for the local features typically used in MVS. REVV outperforms existing compact database representations in the MVS setting. In large-scale retrieval tests, it attains retrieval accuracy similar to that of a Vocabulary Tree that uses 26× more memory. This compactness enables many database images to be queried on a mobile device.
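
As a rough illustration of how a residual-based global signature can be made compact, the sketch below aggregates the residuals between local descriptors and their nearest visual words, then keeps only the sign of each aggregated component. This is a generic residual-aggregation example under stated assumptions, not the REVV pipeline itself: the function names, the tiny codebook, the mean aggregation, the sign binarization, and the Hamming comparison are all illustrative choices that the abstract does not specify.

```python
from typing import List, Sequence


def nearest_word(desc: Sequence[float], codebook: Sequence[Sequence[float]]) -> int:
    """Index of the codebook entry closest to desc (squared Euclidean distance)."""
    best, best_dist = 0, float("inf")
    for i, word in enumerate(codebook):
        dist = sum((a - b) ** 2 for a, b in zip(desc, word))
        if dist < best_dist:
            best, best_dist = i, dist
    return best


def residual_signature(descriptors: Sequence[Sequence[float]],
                       codebook: Sequence[Sequence[float]]) -> List[int]:
    """Hypothetical compact signature: one bit per (visual word, dimension).

    Assumption: residuals are mean-aggregated per visual word and each
    component is binarized by its sign. Words that receive no descriptor
    contribute all-zero bits, which is a simplification.
    """
    k, dim = len(codebook), len(codebook[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for desc in descriptors:
        i = nearest_word(desc, codebook)
        counts[i] += 1
        for j in range(dim):
            sums[i][j] += desc[j] - codebook[i][j]  # residual to the visual word
    bits = []
    for i in range(k):
        for j in range(dim):
            mean = sums[i][j] / counts[i] if counts[i] else 0.0
            bits.append(1 if mean > 0 else 0)
    return bits


def hamming(a: Sequence[int], b: Sequence[int]) -> int:
    """Number of differing bits between two equal-length signatures."""
    return sum(x != y for x, y in zip(a, b))


# Toy example: 2 visual words in 2 dimensions, so each image costs 4 bits.
codebook = [[0.0, 0.0], [10.0, 10.0]]
query = residual_signature([[1.0, -1.0], [9.0, 11.0]], codebook)
database_image = residual_signature([[2.0, -3.0], [11.0, 12.0]], codebook)
distance = hamming(query, database_image)
```

In this sketch the per-image memory cost is fixed by the codebook size and descriptor dimension rather than by the number of local features, which is the kind of property that makes a global signature attractive when the whole database has to fit in a phone's RAM.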
