The syntax of concealment: reliable methods for plain text information hiding

Many plain text information hiding techniques demand deep semantic processing, and so suffer in reliability. Syntactic processing, by comparison, is a more mature and reliable technology. Assuming a perfect parser, this paper evaluates a set of automated and reversible syntactic transforms that can hide information in plain text without changing the meaning or style of a document. A large representative collection of newspaper text is fed through a prototype system. Unlike previous work, the output is subjected to human testing to verify that the text has not been significantly compromised by the information hiding procedure, yielding a success rate of 96% and a bandwidth of 0.3 bits per sentence.
