Correcting geometric and photometric distortion of document images on a smartphone

Abstract. A set of document image processing algorithms for improving the optical character recognition (OCR) capability of smartphone applications is presented. The problem scope covers the correction of geometric and photometric distortion in document images. The proposed framework was developed to satisfy industrial requirements and is implemented on an off-the-shelf smartphone with limited speed and memory. Geometric distortions, i.e., skew and perspective distortion, are corrected by sending the horizontal and vertical vanishing points toward infinity in a downsampled image. Photometric distortion includes image degradation from moiré pattern noise and specular highlights. Moiré pattern noise is removed using low-pass filters of different sizes applied independently to the background and text regions. The contrast of text in a specular-highlighted area is enhanced by locally enlarging the intensity difference between the background and the text while suppressing noise. Extensive experiments indicate that the proposed methods perform consistently and robustly on a smartphone, with a runtime of less than 1 s.
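
The idea of sending vanishing points toward infinity can be illustrated with a minimal sketch. It is not the implementation described in the paper: it assumes the two vanishing points have already been estimated in pixel coordinates, and it uses the standard projective construction in which the line through both vanishing points is mapped to the line at infinity. The function name `homography_to_infinity` and the sample coordinates are hypothetical. The result removes perspective distortion only up to an affine transform; any remaining skew would need a separate step.

```python
import numpy as np

def homography_to_infinity(vp_h, vp_v):
    """Return a 3x3 homography that maps both vanishing points to infinity.

    vp_h, vp_v: (x, y) pixel coordinates of the horizontal and vertical
    vanishing points (assumed to be estimated elsewhere).
    """
    vh = np.array([vp_h[0], vp_h[1], 1.0])
    vv = np.array([vp_v[0], vp_v[1], 1.0])

    # The vanishing line is the line through both vanishing points.
    line = np.cross(vh, vv)
    if abs(line[2]) < 1e-12:
        raise ValueError("vanishing line passes through the image origin")
    line = line / line[2]

    # Replacing the last row of the identity with the vanishing line sends
    # every point on that line to a point with zero homogeneous coordinate.
    H = np.eye(3)
    H[2] = line
    return H

# Hypothetical vanishing points, for illustration only.
H = homography_to_infinity((1000.0, 50.0), (30.0, -2000.0))
mapped = H @ np.array([1000.0, 50.0, 1.0])
# mapped[2] is (numerically) zero, i.e. the point now lies at infinity.
```

Because `H` is a single 3x3 matrix, it can be estimated on a downsampled image, as the abstract describes, and then rescaled for the full-resolution warp.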
