Paper Information - A Statistical Algorithm for Linguistic Steganography Detection Based on Distribution of Words

This paper presents a novel statistical algorithm for linguistic steganography detection that exploits the distribution of words in the text segment under examination. Linguistic steganography is the art of using written natural language to hide the very presence of secret messages. Because its carrier is text, the foundational medium of Internet communication, linguistic steganography plays an important part in the field of Information Hiding (IH). Previous work has focused mainly on linguistic steganography itself, with little research on linguistic steganalysis; we attempt to help fill this gap. In our experiments on detecting three linguistic steganography methods (NICETEXT, TEXTO, and Markov-chain-based), the total accuracies in identifying stego-text segments and normal text segments are 87.39%, 95.51%, 98.50%, 99.15%, and 99.57% when the segment size is 5 kB, 10 kB, 20 kB, 30 kB, and 40 kB, respectively. Our results suggest that linguistic steganalysis based on the distribution of words is promising.

Huang Liusheng | Yang Wei | Chen Zhili | Yu Zhen-shan | Li Lingjun
