Distortion Measurement for Automatic Document Verification

Document forgery detection is important because techniques for generating forgeries are becoming widely available and easy to use, even for untrained persons. In this work, two types of forgeries are considered: forgeries generated by re-engineering a document, and forgeries generated by scanning and printing a genuine document. An unsupervised approach is presented that automatically detects forged documents of these types by detecting the geometric distortions introduced during the forgery process. The matching quality is computed between all pairs of documents, and outlier detection is then performed on the summed matching quality of each document to identify the tampered one. Quantitative evaluation on two public data sets yields a true positive rate from 0.7 to 1.0.
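The outlier step described above can be illustrated with a minimal sketch. It assumes a symmetric matrix of pairwise matching-quality scores in which higher means a better geometric match. The function names, the score values, and the simple z-score rule with its threshold are hypothetical choices for illustration and are not taken from the paper, which does not specify its outlier test here.

```python
from statistics import mean, stdev


def summed_matching_quality(quality):
    """Sum each document's matching quality against every other document."""
    n = len(quality)
    return [sum(quality[i][j] for j in range(n) if j != i) for i in range(n)]


def find_outliers(scores, z_threshold=1.5):
    """Return indices whose summed score lies far below the sample mean.

    The z-score rule and the threshold are assumptions for illustration.
    """
    mu = mean(scores)
    sigma = stdev(scores)
    if sigma == 0:
        return []  # all documents match equally well; nothing stands out
    return [i for i, s in enumerate(scores) if (mu - s) / sigma > z_threshold]


# Hypothetical data: six documents, where document 3 matches the others poorly.
n_docs = 6
forged = 3
quality = [
    [0.0 if i == j else (0.5 if forged in (i, j) else 0.9) for j in range(n_docs)]
    for i in range(n_docs)
]

scores = summed_matching_quality(quality)
suspects = find_outliers(scores)
print(suspects)
```

Summing over all pairs means a single distorted document lowers its own total far more than it lowers anyone else's, which is what makes a per-document outlier test workable without labeled training data.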
