Paper information - Blind Source Separation Techniques for Detecting Hidden Texts and Textures in Document Images

Blind Source Separation techniques, based on both Independent Component Analysis and second-order statistics, are presented and compared for extracting partially hidden texts and textures from document images. Barely perceptible features may occur, for instance, in ancient documents that were erased and then rewritten (palimpsests), or may arise from transparency or seeping of ink from the reverse side, or from watermarks in the paper. Detecting these features can be of great importance to scholars and historians. In our approach, the document is modeled as the superposition of a number of source patterns, and a simplified linear mixture model is introduced to describe the relationship between these sources and multispectral views of the document. The problem of detecting patterns that are barely perceptible in the visible color image is thus formulated as that of separating the various patterns in the mixtures. Some examples from extensive experiments with real ancient documents are shown and discussed.
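
The linear mixture model described in the abstract can be illustrated with a small numerical sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the two synthetic sparse binary "ink" patterns, the hypothetical 2x2 mixing matrix `mixing`, and the use of a textbook symmetric FastICA with a tanh nonlinearity in place of the authors' algorithms. Each observed channel is a weighted sum of the source patterns, x = A s; the sketch whitens the channels (a second-order step) and then rotates them toward statistical independence.

```python
import numpy as np


def whiten(x):
    """Center the mixtures and decorrelate them to unit variance.

    x has shape (n_channels, n_pixels). Uses symmetric (ZCA) whitening.
    """
    x = x - x.mean(axis=1, keepdims=True)
    cov = x @ x.T / x.shape[1]
    eigval, eigvec = np.linalg.eigh(cov)
    return eigvec @ np.diag(eigval ** -0.5) @ eigvec.T @ x


def sym_decorrelate(w):
    """Return (W W^T)^(-1/2) W, the closest orthogonal matrix to W."""
    eigval, eigvec = np.linalg.eigh(w @ w.T)
    return eigvec @ np.diag(eigval ** -0.5) @ eigvec.T @ w


def fastica(x, n_iter=200, tol=1e-8, seed=0):
    """Textbook symmetric FastICA with a tanh nonlinearity (illustrative)."""
    z = whiten(x)
    n_src, n_pix = z.shape
    rng = np.random.default_rng(seed)
    w = sym_decorrelate(rng.standard_normal((n_src, n_src)))
    for _ in range(n_iter):
        g = np.tanh(w @ z)
        g_prime = 1.0 - g ** 2
        # Fixed-point update: E[z g(w^T z)] - E[g'(w^T z)] w, for every row of W.
        w_new = g @ z.T / n_pix - np.diag(g_prime.mean(axis=1)) @ w
        w_new = sym_decorrelate(w_new)
        # Converged when each new row is parallel to the previous one.
        converged = np.max(np.abs(np.abs(np.diag(w_new @ w.T)) - 1.0)) < tol
        w = w_new
        if converged:
            break
    return w @ z


# Synthetic stand-ins for two overlapping patterns (flattened 100x100 images):
# sparse binary masks with different ink densities. Purely hypothetical data.
rng = np.random.default_rng(42)
n_pixels = 100 * 100
densities = np.array([[0.10], [0.05]])
sources = (rng.random((2, n_pixels)) < densities).astype(float)

# Hypothetical mixing matrix: each observed channel sees both patterns.
mixing = np.array([[1.0, 0.5],
                   [0.6, 1.0]])
observed = mixing @ sources  # linear mixture model x = A s

recovered = fastica(observed)

# Absolute correlation between each recovered component and each true source.
match = np.abs(np.corrcoef(recovered, sources)[:2, 2:])
print(np.round(match, 3))
```

Sources recovered this way are defined only up to order, sign, and scale, so the check compares absolute correlations rather than raw pixel values. With real multispectral views, each channel image would be flattened into one row of `observed`.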