Document rectification and illumination correction using a patch-based CNN

We propose a novel learning-based method that rectifies document images with various types of distortion from a single input image. Unlike previous learning-based methods, our approach first learns the distortion flow on patches of the input image rather than on the entire image. We then present a robust technique that stitches the patch results into the rectified document by processing in the gradient domain. In addition, we propose a second network that corrects uneven illumination, further improving readability and OCR accuracy. Because the distortion within a small image patch is less complex than the distortion across the whole page, our patch-based approach, followed by stitching and illumination correction, can significantly improve overall accuracy on both synthetic and real datasets.
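The idea of stitching patch results in the gradient domain can be illustrated with a minimal one-dimensional sketch. This is not the paper's implementation. It assumes that each patch estimates the displacement only up to an unknown constant offset, that the patches overlap, and that a simple average of gradients in the overlaps is acceptable. The function name `stitch_gradient_domain_1d` and the demo values are hypothetical. In two dimensions the integration step would become a least-squares (Poisson-style) solve rather than a cumulative sum.

```python
import numpy as np

def stitch_gradient_domain_1d(patches, starts, length):
    """Stitch overlapping 1D patch estimates that disagree by constant offsets.

    Illustrative assumption: each patch equals the true signal plus an
    unknown per-patch constant, so only its gradient is trustworthy.
    """
    grad_sum = np.zeros(length - 1)
    grad_count = np.zeros(length - 1)
    for patch, start in zip(patches, starts):
        g = np.diff(patch)  # the per-patch offset cancels in the difference
        grad_sum[start:start + len(g)] += g
        grad_count[start:start + len(g)] += 1
    if np.any(grad_count == 0):
        raise ValueError("patches must cover every pair of adjacent samples")
    mean_grad = grad_sum / grad_count  # average the gradients where patches overlap
    # Integrate the averaged gradient, anchoring the first sample at zero.
    return np.concatenate(([0.0], np.cumsum(mean_grad)))

# Hypothetical demo: three overlapping patches, each shifted by its own offset.
truth = np.sin(np.linspace(0.0, 3.0, 20))
starts = [0, 6, 12]
size = 8
offsets = [0.5, -1.0, 2.0]
patches = [truth[s:s + size] + o for s, o in zip(starts, offsets)]
stitched = stitch_gradient_domain_1d(patches, starts, len(truth))
```

Stitching the raw patch values directly would leave a visible jump at each seam, because the offsets differ. Working on gradients removes the offsets before the patches are combined, so the recovered signal matches the true one up to a single global constant.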
