Interframe Coding of Global Image Signatures for Mobile Augmented Reality

In mobile augmented reality, an image captured by a mobile device's camera is often compared against a database hosted on a remote server to recognize objects in the image. To reduce system latency, the amount of data transmitted over the network must be as small as possible. A low-bitrate global signature for still images has previously been shown to achieve high-accuracy image retrieval. In this paper, we develop new methods for interframe coding of a continuous stream of global signatures that reduce the bitrate by nearly two orders of magnitude compared to independent coding of these signatures, while achieving the same or better image retrieval accuracy. The global signatures are constructed in an embedded data structure that offers rate scalability. Together, the new coding methods and the embedded data structure allow high-quality global signatures to be streamed at a bitrate of less than 2 kbps. Furthermore, we present a statistical analysis of retrieval and coding performance to characterize the trade-off between bitrate and image retrieval accuracy and to explain why interframe coding of global signatures substantially outperforms independent coding.
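
The intuition behind interframe coding can be illustrated with a toy sketch. It is not the coding scheme developed in the paper: it assumes a hypothetical binary signature, a hypothetical per-bit change probability `flip_prob` between consecutive frames, and an ideal entropy coder under an i.i.d. Bernoulli model. Under those assumptions, coding only the XOR residual against the previous frame's signature needs far fewer bits than coding each signature on its own, because consecutive frames are highly correlated.

```python
import math
import random


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) source."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def xor_residual(prev_bits, cur_bits):
    """Bits that changed between two consecutive signatures."""
    return [a ^ b for a, b in zip(prev_bits, cur_bits)]


def estimated_bits(bits):
    """Ideal coded size of a bit list under an i.i.d. Bernoulli model."""
    p = sum(bits) / len(bits)
    return len(bits) * binary_entropy(p)


def simulate(num_bits=1024, flip_prob=0.01, seed=0):
    """Compare independent and interframe coding cost for one frame pair.

    num_bits and flip_prob are illustrative assumptions, not values
    taken from the paper.
    """
    rng = random.Random(seed)
    prev = [rng.randint(0, 1) for _ in range(num_bits)]
    # The current signature equals the previous one except for rare flips.
    cur = [b ^ int(rng.random() < flip_prob) for b in prev]
    independent = estimated_bits(cur)
    interframe = estimated_bits(xor_residual(prev, cur))
    return independent, interframe


if __name__ == "__main__":
    independent, interframe = simulate()
    print(f"independent coding: {independent:.1f} bits")
    print(f"interframe coding:  {interframe:.1f} bits")
```

The gap between the two estimates grows as `flip_prob` shrinks, that is, as consecutive signatures become more similar. The paper's statistical analysis addresses the same question for its actual signatures.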
