User-Assisted Ink-Bleed Reduction

This paper presents a novel user-assisted approach to reduce ink-bleed interference found in old manuscripts. The problem is addressed by first having the user provide simple examples of foreground ink, ink-bleed, and the manuscript's background. From this small amount of user-labeled data, likelihoods of each pixel being foreground, ink-bleed, or background are computed and used as the data costs of a dual-layer Markov random field (MRF) that simultaneously labels all pixels in both the front and back sides of the manuscript. This user-assisted approach produces better results than existing algorithms without the need for extensive parameter tuning or prior assumptions about the ink-bleed intensity characteristics. Our overall application framework is discussed along with details of the features used in the data costs, a comparison between K-nearest neighbor and support vector machine for likelihood estimation, the dual-layer MRF formulation with associated inter- and intra-layer costs, and a comparison of our approach against other ink-bleed reduction algorithms.
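The step from user-labeled examples to MRF data costs can be illustrated with a minimal sketch. It is not the authors' implementation: the two-value feature (front-side intensity, mirrored back-side intensity), the label order, the value of `k`, and the function names `knn_likelihoods` and `data_costs` are all assumptions chosen for illustration. The negative log is one common way to turn a likelihood into a cost, and is likewise an assumption here. The sketch covers only the data term; the inter- and intra-layer smoothness costs and the energy minimization are omitted.

```python
import numpy as np

# Assumed label order for this sketch: 0 = foreground, 1 = ink-bleed, 2 = background.
N_CLASSES = 3


def knn_likelihoods(train_feats, train_labels, query_feats, k=3):
    """Estimate per-class likelihoods as the fraction of the k nearest
    user-labeled samples that carry each label (requires k <= n_train)."""
    train_feats = np.asarray(train_feats, dtype=float)
    train_labels = np.asarray(train_labels)
    query_feats = np.asarray(query_feats, dtype=float)

    # Pairwise squared Euclidean distances, shape (n_query, n_train).
    diff = query_feats[:, None, :] - train_feats[None, :, :]
    dist2 = np.sum(diff ** 2, axis=2)

    # Indices of the k closest labeled samples for every query pixel.
    nearest = np.argsort(dist2, axis=1)[:, :k]
    votes = train_labels[nearest]  # shape (n_query, k)

    # Count the votes per class and normalize so each row sums to 1.
    counts = np.stack(
        [(votes == c).sum(axis=1) for c in range(N_CLASSES)], axis=1
    )
    return counts / float(k)


def data_costs(likelihoods, eps=1e-6):
    """Convert likelihoods to data costs with a negative log;
    eps keeps the cost finite when a likelihood is zero."""
    return -np.log(np.clip(likelihoods, eps, 1.0))


if __name__ == "__main__":
    # Hypothetical user strokes: (front intensity, back intensity) in [0, 1].
    feats = [[0.10, 0.90], [0.15, 0.85], [0.12, 0.95],   # foreground ink
             [0.60, 0.10], [0.65, 0.15], [0.55, 0.12],   # ink-bleed
             [0.90, 0.90], [0.95, 0.92], [0.88, 0.95]]   # background
    labels = [0, 0, 0, 1, 1, 1, 2, 2, 2]
    pixels = [[0.11, 0.90], [0.62, 0.11], [0.93, 0.91]]
    print(data_costs(knn_likelihoods(feats, labels, pixels)))
```

In a complete system, these per-pixel costs would be computed for both sides of the page and combined with the smoothness terms before labeling.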
