Word length based zero-watermarking algorithm for tamper detection in text documents

Copyright protection and authentication of digital content have become major concerns in the current digital era. Plain text is the most widely used means of information exchange on the Internet, and it is essential to verify the authenticity of information in any form of communication. Very few techniques are available for plain text watermarking, authentication, and tamper detection. This paper presents a novel zero-watermarking algorithm for tamper detection in plain text documents. The algorithm generates a watermark from the text contents; an extraction algorithm can later recover it to identify whether the text document has been tampered with. Experimental results demonstrate the effectiveness of the algorithm against random tampering attacks. Watermark pattern matching and watermark distortion rate are used as evaluation parameters on multiple text samples of varying length.
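
The abstract does not spell out the algorithm's steps, so the following is only a minimal sketch of the general zero-watermarking idea under stated assumptions: the watermark is taken to be the sequence of word lengths (suggested by the title, not confirmed by the text), and the pattern matching and distortion rates are approximated with a generic sequence-similarity ratio. The function names `generate_watermark`, `pattern_match_rate`, and `distortion_rate` are hypothetical and chosen for illustration.

```python
import re
from difflib import SequenceMatcher


def generate_watermark(text: str) -> list[int]:
    """Derive a watermark from the text itself (nothing is embedded).

    Assumption: the watermark is the list of word lengths, in order.
    """
    return [len(word) for word in re.findall(r"\w+", text)]


def pattern_match_rate(original: list[int], extracted: list[int]) -> float:
    """Similarity in [0, 1] between the stored and re-extracted watermarks.

    Assumption: a generic sequence-similarity ratio stands in for the
    paper's watermark pattern matching measure.
    """
    if not original and not extracted:
        return 1.0
    matcher = SequenceMatcher(None, original, extracted, autojunk=False)
    return matcher.ratio()


def distortion_rate(original: list[int], extracted: list[int]) -> float:
    """Complement of the match rate: 0.0 means no tampering was detected."""
    return 1.0 - pattern_match_rate(original, extracted)


# The watermark is generated once and stored with a trusted party.
stored = generate_watermark("the quick brown fox")

# Later, the watermark is extracted again from the received copy and compared.
received = generate_watermark("the quick red fox")
tampered = distortion_rate(stored, received) > 0.0
```

Because the watermark is derived from the text rather than embedded in it, the document itself is never modified; detection relies entirely on comparing the stored pattern with the one extracted from the received copy. A word-length pattern cannot detect a substitution that preserves every word's length, which is one reason this sketch should not be read as the paper's actual method.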
