Paper information - Natural Language Watermarking and Robust Hashing Based on Presuppositional Analysis

We propose a method of text watermarking and hashing based on natural-language semantic structures. In particular, we are interested in the linguistic semantic phenomenon of presupposition. Presupposition is implicit information that the reader takes for granted and that establishes common ground between the author's and the reader's situational knowledge; it is a semantic component of certain linguistic expressions (lexical items and syntactic constructions called presupposition triggers). The same sentence can be used with or without a presupposition, provided that all the relations between discourse referents are preserved. The number of presuppositions in randomly grouped sentences and the web of resolved presupposed information in the text hold the watermark (e.g. an integrity watermark or proof of ownership), introducing a "secret ordering" into the text structure that makes it resilient to a certain amount of data-altering attacks. This intrinsic structure of the text can also be used as a robust hash of the text.
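The idea of counting presuppositions in randomly grouped sentences can be illustrated with a toy sketch. This is not the authors' implementation: the trigger lexicon `TRIGGERS`, the helper names, the key-seeded shuffle, and the use of count parity as the carried bit are all assumptions made for illustration, and real presupposition detection would need a semantic parser rather than keyword matching.

```python
import hashlib
import random
import re

# Hypothetical toy lexicon of presupposition triggers (an assumption;
# the abstract does not list the triggers the method relies on).
TRIGGERS = {"again", "stopped", "regrets", "realized", "still", "managed"}


def split_sentences(text):
    """Naive sentence splitter on ., ! and ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def count_triggers(sentence):
    """Count the words of a sentence that appear in the toy lexicon."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return sum(1 for w in words if w in TRIGGERS)


def keyed_groups(n_sentences, key, group_size):
    """Partition sentence indices into groups using a key-seeded shuffle.

    The secret key fixes the grouping, standing in for the "secret ordering".
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    seed = int.from_bytes(digest[:8], "big")
    order = list(range(n_sentences))
    random.Random(seed).shuffle(order)
    return [order[i:i + group_size] for i in range(0, n_sentences, group_size)]


def presupposition_bits(text, key, group_size=2):
    """Return one bit per group: the parity of its trigger count (assumed encoding)."""
    sentences = split_sentences(text)
    counts = [count_triggers(s) for s in sentences]
    groups = keyed_groups(len(sentences), key, group_size)
    return [sum(counts[i] for i in group) % 2 for group in groups]


if __name__ == "__main__":
    sample = (
        "John stopped smoking. Mary went home. "
        "Anna regrets the decision. The sky is blue."
    )
    print(presupposition_bits(sample, key="secret"))
```

Under these assumptions, replacing a non-trigger word (for example "blue" with "grey") leaves the bit vector unchanged, which is the sense in which such a structure could act as a robust hash, while adding or removing a single trigger flips the bit of exactly one group.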

Benoit M. Macq | Olga Vybornova
