Sparse Representation Based Inpainting for the Restoration of Document Images Affected by Bleed-Through

Bleed-through is a commonly encountered degradation in ancient printed documents and manuscripts, and it severely impairs their readability. Digital image restoration techniques can be effective in removing or significantly reducing this degradation. In bleed-through document image restoration, the main issue is to identify the bleed-through pixels and replace them with appropriate values that are consistent with their surroundings. In this paper, we propose a two-stage method. First, a pair of properly registered images of the document recto and verso is used to locate the bleed-through pixels on each side. Then, a sparse representation based image inpainting technique is used to fill in the bleed-through areas according to their neighbourhood, in a way that preserves the original appearance of the document. The advantages of the proposed inpainting technique over state-of-the-art methods are illustrated by the improvement in the visual results.
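The two stages described above can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions, not the method of the paper: the abstract does not specify the detection rule, the dictionary, or the sparse solver, so the thresholding detector, the overcomplete DCT dictionary, and the masked orthogonal matching pursuit used here are stand-ins, and all function names and parameters (`locate_bleed_through`, `dct_dictionary`, `masked_omp`, `inpaint_patch`, `ink_thresh`, `paper_level`, `margin`) are hypothetical. Images are assumed to be grayscale in [0, 1] with dark ink, and the verso is assumed to be already mirrored and registered to the recto.

```python
import numpy as np


def locate_bleed_through(recto, verso_registered, ink_thresh=0.4,
                         paper_level=0.9, margin=0.1):
    """Toy detector (assumption, not the paper's rule): a recto pixel is
    flagged when the registered verso has ink there and the recto is
    darker than paper but lighter than its own foreground ink."""
    verso_has_ink = verso_registered < ink_thresh
    recto_is_faint = (recto >= ink_thresh) & (recto < paper_level - margin)
    return verso_has_ink & recto_is_faint


def dct_dictionary(patch_size, atoms_per_dim):
    """Overcomplete separable 2-D DCT dictionary with unit-norm columns
    (a fixed stand-in for a learned dictionary)."""
    n, k = patch_size, atoms_per_dim
    grid = np.arange(n)[:, None] * np.arange(k)[None, :]
    d1 = np.cos(np.pi * grid / k)            # n x k one-dimensional atoms
    d1[:, 1:] -= d1[:, 1:].mean(axis=0)      # remove DC from non-constant atoms
    d2 = np.kron(d1, d1)                     # (n*n) x (k*k) two-dimensional atoms
    return d2 / np.linalg.norm(d2, axis=0)


def masked_omp(D, y, known, sparsity, tol=1e-8):
    """Orthogonal matching pursuit fitted on the known pixels only."""
    norms = np.linalg.norm(D[known], axis=0)
    norms[norms == 0] = 1.0
    Dk = D[known] / norms                    # renormalise the masked atoms
    yk = y[known]
    residual = yk.copy()
    support, coef = [], np.zeros(0)
    for _ in range(sparsity):
        if np.linalg.norm(residual) <= tol:
            break
        j = int(np.argmax(np.abs(Dk.T @ residual)))
        if j in support:                     # no new atom helps; stop early
            break
        support.append(j)
        coef, *_ = np.linalg.lstsq(Dk[:, support], yk, rcond=None)
        residual = yk - Dk[:, support] @ coef
    x = np.zeros(D.shape[1])
    if support:
        x[support] = coef / norms[support]   # undo the renormalisation
    return x


def inpaint_patch(patch, bleed_mask, D, sparsity=4):
    """Replace the masked pixels of one patch with its sparse reconstruction."""
    y = patch.astype(float).ravel()
    x = masked_omp(D, y, ~bleed_mask.ravel(), sparsity)
    filled = patch.astype(float).copy()
    filled[bleed_mask] = (D @ x).reshape(patch.shape)[bleed_mask]
    return filled


if __name__ == "__main__":
    D = dct_dictionary(8, 11)
    clean = 0.7 * D[:, 13].reshape(8, 8)     # synthetic patch: a single atom
    mask = np.zeros((8, 8), dtype=bool)
    mask[3:5, 2:6] = True                    # pretend these pixels are bleed-through
    damaged = clean.copy()
    damaged[mask] = 0.0
    restored = inpaint_patch(damaged, mask, D, sparsity=3)
    print("max error on filled pixels:", np.abs(restored - clean)[mask].max())
```

In a full pipeline, the mask from the first function would be computed once per side, and the patch routine would be applied to overlapping patches whose results are averaged. The key design point is that the sparse code is estimated from the reliable pixels only, so the fill is driven by the surrounding texture rather than by the bleed-through values.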
