SRR-GAN: Super-Resolution based Recognition with GAN for Low-Resolved Text Images

Text images convey important information for various applications, but recognizing low-resolution text images remains a challenge. Most existing methods address this problem with a two-step cascaded scheme: image super-resolution followed by high-resolution text recognition. In this paper, we propose a novel framework, called SRR-GAN, which integrates text recognition with super-resolution via adversarial learning. By jointly training the recognition and super-resolution models, more generic features can be learned for images of varying quality, yielding high recognition performance on both high-resolution and low-resolution images. Experiments on natural scene and handwritten text demonstrate that SRR-GAN outperforms the cascaded scheme on low-resolution images, with relative improvements in recognition accuracy of 10%-20% on five scene/handwritten text datasets. Meanwhile, SRR-GAN maintains high performance on high-resolution images.
