Marking text features of document images to deter illicit dissemination

A major impediment to the widespread adoption of services for electronic distribution of copyrighted material is the ease with which illicit copies can be made and disseminated. In this paper, we describe feature coding techniques that mark document images with codes that are indiscernible to readers but can be decoded by document feature analysis. These codes can be used to trace the source of errant documents. We propose three coding methods: line-shift coding, word-shift coding, and character coding. We describe how the coding features are found, inserted, and decoded. Finally, we present preliminary experimental results indicating that line-shift coding is robust: codes can be decoded even from images that have passed through up to ten generations of photocopying.
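
To make the idea of line-shift coding concrete, the following is a minimal sketch rather than the method evaluated in this paper. It assumes that each text line is represented only by the vertical position of its baseline, that every second line is moved up or down by a small amount to carry one bit, and that the remaining lines stay in place as references. The function names, the one-pixel shift, and the convention that a downward shift means 1 are all illustrative assumptions.

```python
def encode_line_shifts(baselines, bits, shift=1):
    """Return a copy of `baselines` with every second line shifted.

    Line 2k+1 carries bit k: it moves down by `shift` for a 1 and up by
    `shift` for a 0. Even-indexed lines are left alone as references.
    (Hypothetical convention chosen for this sketch.)
    """
    capacity = (len(baselines) - 1) // 2  # each marked line needs a neighbour on both sides
    if len(bits) > capacity:
        raise ValueError("not enough text lines to carry the code")
    marked = list(baselines)
    for k, bit in enumerate(bits):
        marked[2 * k + 1] += shift if bit else -shift
    return marked


def decode_line_shifts(baselines, n_bits):
    """Recover bits by comparing the gap above and below each marked line."""
    bits = []
    for k in range(n_bits):
        i = 2 * k + 1
        gap_above = baselines[i] - baselines[i - 1]
        gap_below = baselines[i + 1] - baselines[i]
        # A line shifted down is farther from the line above it.
        bits.append(1 if gap_above > gap_below else 0)
    return bits


# Nine evenly spaced lines, 40 pixels apart, carrying a 4-bit code.
original = [100 + 40 * i for i in range(9)]
code = [1, 0, 1, 1]
marked = encode_line_shifts(original, code)
recovered = decode_line_shifts(marked, len(code))
```

Because the decoder in this sketch compares two adjacent gaps instead of absolute positions, a uniform scaling or vertical offset of the page leaves the decoded bits unchanged. It does not model noise or the nonuniform distortion that photocopying can introduce.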