A decision theoretic framework for analyzing binary hash-based content identification systems

Content identification has many applications, ranging from preventing illegal sharing of copyrighted content on video sharing websites to automatic identification and tagging of content. Several content identification techniques based on watermarking or robust hashes have been proposed in the literature, but they have mostly been evaluated only through experiments. This paper analyzes binary hash-based content identification schemes under a decision-theoretic framework and presents a lower bound on the hash length required to correctly identify multimedia content that may have undergone modifications. A practical content identification scheme is evaluated under the proposed framework. The experimental results agree closely with the performance predicted by the theoretical analysis.
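To make the setting concrete, the following is a minimal sketch of binary hash-based identification, not the scheme evaluated in the paper. It assumes a detector that matches a query hash to the closest database hash in Hamming distance and accepts the match only within a threshold. The error-probability helpers additionally assume that modifications flip each hash bit independently with probability `p_flip`, and that hashes of unrelated content have independent, equiprobable bits. The function names, the threshold rule, and both statistical assumptions are illustrative and are not taken from the abstract.

```python
from math import comb


def hamming(a: int, b: int) -> int:
    """Number of differing bits between two hashes stored as integers."""
    return bin(a ^ b).count("1")


def identify(query: int, database: dict, threshold: int):
    """Return the ID of the closest hash within `threshold` bits, else None."""
    best_id, best_dist = None, threshold + 1
    for content_id, stored_hash in database.items():
        dist = hamming(query, stored_hash)
        if dist < best_dist:
            best_id, best_dist = content_id, dist
    return best_id


def binom_cdf(n: int, k: int, p: float) -> float:
    """P[X <= k] for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def false_positive_prob(n_bits: int, threshold: int) -> float:
    """Chance an unrelated hash falls within the threshold (assumed uniform bits)."""
    return binom_cdf(n_bits, threshold, 0.5)


def miss_prob(n_bits: int, threshold: int, p_flip: float) -> float:
    """Chance a modified copy exceeds the threshold (assumed i.i.d. bit flips)."""
    return 1.0 - binom_cdf(n_bits, threshold, p_flip)


if __name__ == "__main__":
    db = {"clip_a": 0b11110000, "clip_b": 0b00001111}
    print(identify(0b11110001, db, threshold=2))
    # Under these assumptions, longer hashes lower both error probabilities
    # when the threshold is scaled with the hash length.
    for n in (32, 64, 128):
        t = n // 4
        print(n, false_positive_prob(n, t), miss_prob(n, t, p_flip=0.1))
```

Under these assumptions, both error probabilities shrink as the hash length grows with the threshold held at a fixed fraction of the length, which is the kind of trade-off a lower bound on hash length captures.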
