Paper information - Document rectification and illumination correction using a patch-based CNN

Document rectification and illumination correction using a patch-based CNN

We propose a novel learning method to rectify document images with various distortion types from a single input image. In contrast to previous learning-based methods, our approach first learns the distortion flow on input image patches rather than on the entire image. We then present a robust technique that stitches the patch results into the rectified document by processing in the gradient domain. Furthermore, we propose a second network to correct uneven illumination, further improving readability and OCR accuracy. Because smaller image patches contain less complex distortion, our patch-based approach followed by stitching and illumination correction can significantly improve overall accuracy on both synthetic and real datasets.
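The abstract does not spell out the stitching formulation, so the following is only a 1D toy sketch of why working in the gradient domain can help, not the authors' method. It assumes each patch's estimated flow is correct up to an unknown constant offset; the function name `stitch_1d_gradient_domain` and its parameters are hypothetical. Gradients from overlapping patches are averaged and then integrated, which removes the per-patch offsets.

```python
import numpy as np

def stitch_1d_gradient_domain(patches, starts, length):
    """Toy 1D stitching: average patch gradients, then integrate.

    patches: list of 1D arrays (per-patch flow estimates, hypothetical input)
    starts:  start index of each patch in the full signal
    length:  number of samples in the full signal
    """
    grad_sum = np.zeros(length - 1)
    grad_cnt = np.zeros(length - 1)
    for patch, s in zip(patches, starts):
        g = np.diff(patch)              # forward differences cancel constant offsets
        grad_sum[s:s + len(g)] += g
        grad_cnt[s:s + len(g)] += 1
    if np.any(grad_cnt == 0):
        raise ValueError("patches must cover every adjacent sample pair")
    grad = grad_sum / grad_cnt          # average gradients where patches overlap
    # Integrate the gradients; the first sample is anchored at 0 because the
    # absolute offset cannot be recovered from gradients alone.
    return np.concatenate([[0.0], np.cumsum(grad)])

# Two overlapping patches of the same underlying flow, each shifted by a
# different (assumed) constant offset.
true_flow = np.sin(np.linspace(0.0, 3.0, 20))
patches = [true_flow[0:12] + 5.0, true_flow[8:20] - 2.0]
stitched = stitch_1d_gradient_domain(patches, starts=[0, 8], length=20)
```

In this toy setting the stitched signal matches the underlying flow up to the anchoring constant, whereas directly averaging the patch values in the overlap would leave a visible seam. A 2D version would typically solve a Poisson-type least-squares problem instead of using a cumulative sum.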
