Non-Local Sparse Image Inpainting for Document Bleed-Through Removal

Bleed-through is a frequent and pervasive degradation in ancient manuscripts, caused by ink seeping through from the opposite side of the sheet. It appears as extra, interfering text that hinders readability and makes the content difficult to decipher. Digital image restoration techniques have been successfully employed to remove or significantly reduce this distortion. This paper proposes a two-step restoration method for documents affected by bleed-through, exploiting information from both the recto and verso images. In the first step, the bleed-through pixels are identified using a non-stationary, linear model of the two texts overlapped in the recto-verso pair. In the second step, a dictionary learning-based sparse image inpainting technique with non-local patch grouping is used to reconstruct the image information contaminated by bleed-through. An overcomplete sparse dictionary is learned from the bleed-through-free image patches and is then used to estimate a suitable fill-in for the identified bleed-through pixels. Non-local patch similarity is employed in the sparse reconstruction of each patch to enforce local similarity. Thanks to the intrinsic image sparsity and non-local patch similarity, the natural texture of the background is well reproduced in the bleed-through areas, and even a possible overestimation of the bleed-through pixels is effectively corrected, so that the original appearance of the document is preserved. We evaluate the proposed method on images from a popular database of ancient documents, and the results validate its performance against state-of-the-art methods.
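
To make the second step more concrete, the following is a minimal sketch of sparse fill-in for a single patch, under several simplifying assumptions that are not taken from the paper: the patch is a flattened 1-D vector, the dictionary is a fixed overcomplete DCT dictionary rather than one learned from bleed-through-free patches, the sparse coder is plain matching pursuit, and non-local patch grouping is omitted. The names `dct_dictionary` and `inpaint_patch` are hypothetical. The sparse code is fitted on the bleed-through-free pixels only and then used to synthesize the masked ones.

```python
import math


def dct_dictionary(n, k):
    """Overcomplete DCT dictionary: k unit-norm atoms of length n (k >= n).

    Assumption: a fixed stand-in for a dictionary learned from clean patches.
    """
    atoms = []
    for j in range(k):
        atom = [math.cos(math.pi * j * (i + 0.5) / k) for i in range(n)]
        norm = math.sqrt(sum(a * a for a in atom))
        atoms.append([a / norm for a in atom])
    return atoms


def inpaint_patch(patch, known, atoms, n_iter=10, tol=1e-10):
    """Fill the pixels where known[i] is False using matching pursuit.

    The sparse code is estimated from the known pixels only; the full
    atoms are then used to synthesize values for the masked pixels.
    """
    idx = [i for i, is_known in enumerate(known) if is_known]
    residual = [patch[i] for i in idx]

    # Restrict every atom to the known pixels and record its norm there.
    masked = []
    for atom in atoms:
        sub = [atom[i] for i in idx]
        masked.append((sub, math.sqrt(sum(s * s for s in sub))))

    coeffs = [0.0] * len(atoms)
    for _ in range(n_iter):
        best_j, best_c = None, 0.0
        for j, (sub, norm) in enumerate(masked):
            if norm < 1e-12:
                continue  # atom is invisible on the known pixels
            # Correlation of the residual with the normalized masked atom.
            c = sum(r * s for r, s in zip(residual, sub)) / norm
            if abs(c) > abs(best_c):
                best_j, best_c = j, c
        if best_j is None or abs(best_c) < tol:
            break
        sub, norm = masked[best_j]
        step = best_c / norm  # coefficient with respect to the full atom
        coeffs[best_j] += step
        residual = [r - step * s for r, s in zip(residual, sub)]

    recon = [sum(c * atom[i] for c, atom in zip(coeffs, atoms))
             for i in range(len(patch))]
    # Keep the original value wherever the pixel is bleed-through-free.
    return [patch[i] if known[i] else recon[i] for i in range(len(patch))]


# Usage: a flat background patch with three pixels flagged as bleed-through.
atoms = dct_dictionary(16, 32)
patch = [0.8] * 16
known = [True] * 16
for i in (4, 5, 11):
    known[i] = False
    patch[i] = 0.2  # darker interfering ink
restored = inpaint_patch(patch, known, atoms)
```

In this toy case the flagged pixels are replaced by the background level because the constant atom explains all known pixels. A fuller implementation along the lines of the abstract would learn the dictionary from clean patches and code groups of similar patches jointly.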
