Perceptual hashing for hardcopy document authentication using morphological segmentation

Semi-fragile authentication of hardcopy documents is a technique designed to detect any visually significant alteration in a document while ignoring incidental alterations, such as distortions resulting from print-scan operations, photocopying, rotation, scaling, translation and minor stains on the paper. It is meant to replace notarially authenticated photocopies. However, to our knowledge, there is still no functional authentication system for printed documents, only for documents in digital form [1]. A semi-fragile authentication system is composed of three sub-components: perceptual hashing, cryptography and data hiding. This work is concerned with the first sub-component. The perceptual image hash h(A) of an image A is a value that identifies A. It is also called a robust visual hash or media hash [2], [3], [4]. Moreover, given two images A and B, the distance D[h(A), h(B)] between the hashes must be roughly proportional to the perceived visual difference between A and B. To our knowledge, no perceptual hashing scheme has been proposed for document authentication, and perceptual hashing schemes for continuous-tone images cannot be directly applied to authenticate documents. Behera et al. [5] have proposed a perceptual hashing scheme for document retrieval that takes a low-resolution image of the document as input. However, it cannot be used for document authentication, because authentication must detect the alteration of even a single character and must use high-resolution images. This ongoing work aims to propose a perceptual hashing scheme for document authentication.
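To make the roles of h(A) and D[h(A), h(B)] concrete, the following is a minimal sketch of a toy perceptual hash for binary images. It is not the morphological-segmentation method this work proposes, and it would not meet the single-character sensitivity required for authentication; it only illustrates the interface. The names `block_density_hash` and `hamming_distance`, the 4x4 grid and the 0.5 threshold are all assumptions made for illustration.

```python
def block_density_hash(image, grid=4, threshold=0.5):
    """Toy hash h(A): one bit per grid cell, set when the fraction of
    black pixels (value 1) in that cell is at least `threshold`.
    `image` is a list of rows of 0/1 values (hypothetical input format)."""
    rows, cols = len(image), len(image[0])
    bits = []
    for gy in range(grid):
        for gx in range(grid):
            # Pixel bounds of the current grid cell.
            y0, y1 = gy * rows // grid, (gy + 1) * rows // grid
            x0, x1 = gx * cols // grid, (gx + 1) * cols // grid
            ink = sum(image[y][x] for y in range(y0, y1) for x in range(x0, x1))
            density = ink / ((y1 - y0) * (x1 - x0))
            bits.append(1 if density >= threshold else 0)
    return bits


def hamming_distance(h1, h2):
    """Toy distance D[h(A), h(B)]: number of differing hash bits."""
    return sum(b1 != b2 for b1, b2 in zip(h1, h2))


# 8x8 test image: 2x2 cells forming a checkerboard of black and white.
original = [[1 if (y // 2 + x // 2) % 2 == 0 else 0 for x in range(8)]
            for y in range(8)]

# Incidental change: a single pixel lost inside a black cell.
noisy = [row[:] for row in original]
noisy[0][0] = 0

# Significant change: a whole white cell filled with black.
tampered = [row[:] for row in original]
for y in range(0, 2):
    for x in range(2, 4):
        tampered[y][x] = 1

h = block_density_hash(original)
print(hamming_distance(h, block_density_hash(noisy)))
print(hamming_distance(h, block_density_hash(tampered)))
```

In this sketch the single-pixel change leaves the hash unchanged, while filling a whole cell flips one bit, so the distance grows with the size of the visible change. A hash suitable for authentication would need far finer granularity than fixed grid cells.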

[1]  Hae Yong Kim, et al.  New Public-Key Authentication Watermarking for JBIG2 Resistant to Parity Attacks, 2005, IWDW.

[2]  Shih-Fu Chang, et al.  A robust content based digital signature for image authentication, 1996, Proceedings of 3rd IEEE International Conference on Image Processing.

[3]  Sargur N. Srihari.  Document Image Understanding, 1986, FJCC.

[4]  Hae Yong Kim, et al.  Data Hiding for Binary Documents Robust to Print-Scan, Photocopy and Geometric Distortions, 2007, XX Brazilian Symposium on Computer Graphics and Image Processing (SIBGRAPI 2007).

[5]  Friedrich M. Wahl, et al.  Document Analysis System, 1982, IBM J. Res. Dev.

[6]  Denis Lalanne, et al.  Visual signature based identification of Low-resolution document images, 2004, DocEng '04.

[7]  Ton Kalker, et al.  Visual hashing of digital video: applications and techniques, 2001, Optics + Photonics.

[8]  Vishal Monga, et al.  Perceptual Image Hashing Via Feature Points: Performance Evaluation and Tradeoffs, 2006, IEEE Transactions on Image Processing.