Digital Watermarking Technique for Text Document Protection Using Data Mining Analysis

Information security is now a top priority for all organizations. With the rapid development of Internet technologies such as the Internet of Things (IoT), big data, and cloud computing, individuals, government officials, and the military face data security problems. Given the massive rate of data growth, managing vast amounts of data safely and effectively while designing smart cities is a challenging task for researchers. It has become quite easy to produce illegal copies of digital content. Verifying digital content is a major issue because such content is generated daily and shared via the Internet. Only a limited number of techniques are available for document copyright protection, and most existing techniques either introduce distortion during watermark insertion or lack capacity. To address this, a digital watermarking technique is proposed for document copyright protection and ownership verification with the help of data mining. Data mining techniques are applied to find suitable properties of the document for embedding the watermark. The proposed model provides copyright protection for text documents in both local and cloud computing paradigms. To evaluate the proposed technique, 20 different text documents are subjected to several attacks, including formatting, insertion, and deletion attacks. The proposed technique attains a high level of imperceptibility, with peak signal-to-noise ratio (PSNR) values between 64.67% and 71.03% and similarity (SIM) percentages between 99.92% and 99.99%. The proposed technique is robust against formatting attacks, and its capacity is also improved compared with previous techniques.
