Text Recognition on Images from Social Media

The text recognition problem has been studied for many years. Several OCR engines exist that solve it successfully for many languages, but they work well only on high-quality scanned images. Social networks now contain a large number of images whose text needs to be analyzed and recognized, and these images vary widely in quality: text mixed with graphics, poor-quality photos taken with smartphone cameras, and so on. In this paper we present a text extraction pipeline for images of varying quality collected from social media. Input images are categorized into classes, and class-specific preprocessing (illumination improvement, text localization, etc.) is applied to each. An OCR engine is then used to recognize the text. We present the results of our experiments on a dataset collected from social media.
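The classify, preprocess, recognize flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's implementation: the class names ("low_light", "clean"), the brightness threshold, the contrast stretch used as a stand-in for illumination improvement, and the toy list-of-lists image type are all hypothetical. The OCR engine is passed in as a callable so that any engine could be plugged in.

```python
from typing import Callable, Dict, List

# Toy image type: rows of grayscale pixels in 0..255 (assumption for illustration).
Image = List[List[int]]


def mean_brightness(img: Image) -> float:
    """Average pixel value over the whole image."""
    pixels = [p for row in img for p in row]
    return sum(pixels) / len(pixels)


def classify(img: Image) -> str:
    """Hypothetical classifier: the class names and threshold are assumptions."""
    return "low_light" if mean_brightness(img) < 80 else "clean"


def stretch_contrast(img: Image) -> Image:
    """Linear contrast stretch to 0..255, a simple stand-in for illumination improvement."""
    pixels = [p for row in img for p in row]
    lo, hi = min(pixels), max(pixels)
    if hi == lo:  # flat image: nothing to stretch
        return [row[:] for row in img]
    return [[(p - lo) * 255 // (hi - lo) for p in row] for row in img]


def identity(img: Image) -> Image:
    """No preprocessing; returns a copy of the image."""
    return [row[:] for row in img]


# Class-specific preprocessing, looked up by the label the classifier returns.
PREPROCESSORS: Dict[str, Callable[[Image], Image]] = {
    "low_light": stretch_contrast,
    "clean": identity,
}


def extract_text(img: Image, ocr: Callable[[Image], str]) -> str:
    """Categorize the image, apply its class-specific preprocessing, then run OCR."""
    label = classify(img)
    prepared = PREPROCESSORS[label](img)
    return ocr(prepared)
```

Keeping the preprocessing steps in a lookup table means a new image class only needs a new entry, and the recognition step stays unchanged. In practice the `ocr` argument would wrap a real engine, and `classify` would be replaced by whatever categorization the pipeline actually uses.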
