Document Image Dewarping using Deep Learning

Distorted images are a major problem for Optical Character Recognition (OCR), so dewarping has become a principal preprocessing step for performing OCR on such images. This paper presents a new document dewarping method that removes curl and geometric distortion from modern and historical documents. Most traditional dewarping algorithms are based on text-line feature extraction and segmentation. However, text-line extraction and segmentation can be complicated. The proposed technique therefore does not need any complicated methods to process text lines. It is based on Deep Learning and can be applied to all types of text documents, including documents with images and graphics. Moreover, no preprocessing is required before applying the method to warped images. In the proposed system, document distortion is treated as an image-to-image translation problem. The method is implemented with the pix2pixHD network, which uses Conditional Generative Adversarial Networks (CGAN). The network is trained on the UW3 dataset, with distorted documents as input and the clean images as targets. The images generated by the proposed method are cleanly dewarped and of high resolution, and they can be used to perform OCR. Finally, the proposed method is evaluated and compared with the existing Computer Vision based method.
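
For reference, the image-to-image formulation can be written with the standard conditional GAN objective. This is only a sketch and not necessarily the exact loss used in this work: the symbols x (distorted document), y (clean target image), G (generator), and D (discriminator) are notation assumed here, and pix2pixHD adds further terms, such as multi-scale discriminators and a feature-matching loss, that are omitted.

```latex
% Assumed notation: x = distorted document, y = clean target,
% G = generator (dewarping network), D = discriminator.
\min_G \max_D \; \mathcal{L}_{\mathrm{cGAN}}(G, D)
  = \mathbb{E}_{(x, y)}\big[\log D(x, y)\big]
  + \mathbb{E}_{x}\big[\log\big(1 - D(x, G(x))\big)\big]
```

Under this reading, the discriminator sees the distorted input together with either the real clean image or the generated one, which pushes the generator toward outputs that are plausible dewarped versions of that specific input.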
