Restoration of recto-verso archival documents through a regularized nonlinear model

We approach the removal of back-to-front interferences from recto and verso scans of archival documents as a blind source separation problem, treating the ideal front and back images as two individual patterns that overlap in the observed scans through some mixing operator. The nonlinear mixing model and the related restoration algorithm proposed in [1] are efficient for modern documents affected by mild show-through, but are not fully adequate for ancient documents, which are often degraded by the heavier and non-stationary bleed-through distortion. We therefore propose to modify this data model to account for the non-stationarity of the degradation, and resort to the genuine concept of source separation to derive the restoration algorithm. Within a regularization approach, we jointly estimate the ideal images and the model parameters by minimizing an energy function of all the unknowns, which also accounts for the local autocorrelation of the ideal images. We derive a fully deterministic algorithm that is computationally efficient, and analyze its performance on documents heavily degraded by either show-through or bleed-through.

[1] Chew Lim Tan, et al. Restoration of Archival Documents Using a Wavelet Technique, 2002, IEEE Trans. Pattern Anal. Mach. Intell.

[2] Chew Lim Tan, et al. Matching of double-sided document images to remove interference, 2001, Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001).

[3] Andrew Blake, et al. Visual Reconstruction, 1987, MIT Press.

[4] Gaurav Sharma, et al. Show-through cancellation in scans of duplex printed documents, 2001, IEEE Trans. Image Process.

[5] Eric Dubois, et al. Reduction of Bleed-through in Scanned Manuscript Documents, 2001, PICS.

[6] Anna Tonazzini, et al. Registration and Enhancement of Double-Sided Degraded Manuscripts Acquired in Multispectral Modality, 2009, 10th International Conference on Document Analysis and Recognition.

[7] Anna Tonazzini, et al. Multichannel Blind Separation and Deconvolution of Images for Document Analysis, 2010, IEEE Transactions on Image Processing.

[8] Farnood Merrikh-Bayat, et al. Using Non-Negative Matrix Factorization for Removing Show-Through, 2010, LVA/ICA.

[9] Anna Tonazzini, et al. Independent component analysis for document restoration, 2004, Document Analysis and Recognition.

[10] R. F. Moghaddam, et al. Low quality document image modeling and enhancement, 2009, International Journal of Document Analysis and Recognition (IJDAR).

[11] Anna Tonazzini, et al. A Deterministic Algorithm for Reconstructing Images with Interacting Discontinuities, 1994, CVGIP Graph. Model. Image Process.

[12] Boaz Ophir, et al. Show-Through Cancellation in Scanned Images using Blind Source Separation Techniques, 2007, IEEE International Conference on Image Processing.

[13] Anna Tonazzini, et al. Fast correction of bleed-through distortion in grayscale documents by a blind source separation technique, 2007, International Journal of Document Analysis and Recognition (IJDAR).