A zero-watermarking algorithm for text documents based on structural components

Advances in communication technologies have made it easier to distribute and share information, and the Internet has turned the world into a global village. Alongside the benefits of information technology come threats of copyright violation, digital counterfeiting, privacy breaches, and plagiarism. Researchers have addressed these issues for images, audio, and video, but copyright protection for plain text has been neglected. In this paper, we propose a novel zero-watermarking algorithm for authentication and copyright protection of text documents. The algorithm is robust against content-preserving modifications and, at the same time, is capable of detecting malicious tampering. Experimental results demonstrate its effectiveness against random tampering attacks, measured by the normalized absolute error between the original and extracted watermarks. The results are also compared with recent work in this domain.
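The abstract does not define the normalized absolute error, so the following is only a minimal sketch under a common definition from the image-quality literature: the sum of absolute differences between the original and extracted watermark, divided by the sum of absolute values of the original. The function name `normalized_absolute_error` and the representation of a watermark as a sequence of numbers (for example, character codes) are assumptions made for illustration, not the authors' implementation.

```python
from typing import Sequence


def normalized_absolute_error(
    original: Sequence[float], extracted: Sequence[float]
) -> float:
    """Return sum(|o - e|) / sum(|o|) for two equal-length watermarks.

    Assumed definition (hypothetical with respect to the paper):
    0.0 means the extracted watermark matches the original exactly;
    larger values indicate more tampering.
    """
    if len(original) != len(extracted):
        raise ValueError("watermarks must have the same length")
    denominator = sum(abs(o) for o in original)
    if denominator == 0:
        raise ValueError("original watermark must not be all zeros")
    numerator = sum(abs(o - e) for o, e in zip(original, extracted))
    return numerator / denominator


# Hypothetical watermarks encoded as character codes.
original_wm = [ord(c) for c in "the"]
tampered_wm = [ord(c) for c in "thf"]  # last character altered by one code point
print(normalized_absolute_error(original_wm, original_wm))
print(normalized_absolute_error(original_wm, tampered_wm))
```

Under this assumed definition, an untampered document yields an error of zero, and the error grows with the size of the difference between the two watermarks.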
