The Application of Deep Convolutional Denoising Autoencoder for Optical Character Recognition Preprocessing

Converting physical documents into digital text generally requires a scanner to obtain high-quality document images, which OCR software then reads to produce the digital text. The weakness of this method is that OCR software achieves high accuracy only on high-quality document images with little blur and noise and no parallax. We developed an application that improves document image quality with a Deep Convolutional Denoising Autoencoder before the image is read by the OCR application. The final product of this program is digital text converted from a document image taken with a smartphone. On blurred images, the application increases accuracy by 26.68% compared to standard Tesseract OCR, and it outperforms Simple OCR in average accuracy testing.
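The abstract does not state how OCR accuracy is computed. As a minimal sketch, assuming a character-level accuracy based on edit distance (a common choice, not confirmed as the authors' metric), the comparison between OCR output and ground-truth text could look like the following. The function names `levenshtein` and `character_accuracy` are hypothetical and introduced only for illustration.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions,
    and substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop to use less memory.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, ch_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]


def character_accuracy(reference: str, ocr_output: str) -> float:
    """Assumed metric: 1 - edit_distance / len(reference), clamped at 0."""
    if not reference:
        return 1.0 if not ocr_output else 0.0
    distance = levenshtein(reference, ocr_output)
    return max(0.0, 1.0 - distance / len(reference))
```

Under this assumed metric, the same ground-truth text would be scored against the OCR output obtained with and without the denoising step, and the two scores compared.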
