A Hybrid Intelligent Text Watermarking and Natural Language Processing Approach for Transferring and Receiving an Authentic English Text Via Internet

The rapid growth in the exchange of text over the internet has made the security and reliability of digital content a major research issue. The main challenges are authentication, integrity verification, and tampering detection. This paper proposes a Robust English Text Watermarking and Natural Language Processing Approach (RETWNLPA), based on the word mechanism and the first-level order of a Markov model, to improve the accuracy of tampering detection in sensitive English text. RETWNLPA embeds and detects the watermark logically, without altering the original text document. Based on the hidden Markov model (HMM), the first-level order of the word mechanism is used to analyze the interrelationships between the words of the English text. The extracted features serve as the watermark information and are integrated with text zero-watermarking techniques. RETWNLPA has been implemented and validated on attacked English text. Experiments were performed on four datasets of varying sizes, with common tampering attacks applied at random locations. The simulation results indicate that the method detects tampering accurately under all of the tested attack types. Comparison results show that RETWNLPA outperforms the baseline approaches HNLPZWA (an intelligent hybrid of natural language processing and zero-watermarking approach) and ZWAFWMMM (Zero-Watermarking Approach based on Fourth level order of Word Mechanism of Markov Model) in terms of tampering detection accuracy.
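The general idea of deriving a watermark from first-order word transitions can be illustrated with a minimal sketch. This is not the RETWNLPA implementation: the function names (`tokenize`, `build_transitions`, `generate_watermark`, `match_rate`), the use of SHA-256, and the Jaccard-style matching score are all assumptions made for illustration, and a plain first-order Markov chain over words stands in for the model described above.

```python
import hashlib
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens (illustrative tokenizer)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_transitions(text):
    """First-order word-level model: map each word to counts of the word that follows it."""
    words = tokenize(text)
    transitions = defaultdict(lambda: defaultdict(int))
    for current, following in zip(words, words[1:]):
        transitions[current][following] += 1
    return transitions


def generate_watermark(text):
    """Derive a watermark from the text without modifying it (zero-watermarking).

    Assumption: every (word, next word, count) transition is hashed with SHA-256,
    and the set of digests is treated as the watermark.
    """
    patterns = set()
    for current, followers in build_transitions(text).items():
        for following, count in followers.items():
            token = f"{current}|{following}|{count}".encode("utf-8")
            patterns.add(hashlib.sha256(token).hexdigest())
    return patterns


def match_rate(original_watermark, received_text):
    """Jaccard similarity between the stored watermark and the one regenerated
    from the received text; 1.0 means no difference was detected."""
    received = generate_watermark(received_text)
    union = original_watermark | received
    if not union:
        return 1.0
    return len(original_watermark & received) / len(union)


original = "the quick brown fox jumps over the lazy dog"
watermark = generate_watermark(original)

# An unmodified text regenerates the same watermark.
untouched_score = match_rate(watermark, original)

# Replacing one word changes the transitions around it and lowers the score.
tampered_score = match_rate(watermark, "the quick red fox jumps over the lazy dog")
```

In this sketch the watermark is stored separately from the text, so the document itself is never altered; a score below 1.0 on the received text signals that some word transitions were inserted, deleted, or reordered.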
