Paper information - SRR-GAN: Super-Resolution based Recognition with GAN for Low-Resolved Text Images

Text images convey important information for many applications, but recognizing low-resolution text images remains a challenge. Most existing methods address this problem with a two-step cascaded scheme: image super-resolution followed by high-resolution text recognition. In this paper, we propose a novel framework, called SRR-GAN, which integrates text recognition with super-resolution via adversarial learning. By jointly training the recognition and super-resolution models, the framework can learn more generic features for images of varying quality, yielding high recognition performance on both high-resolution and low-resolution images. Experiments on natural scene and handwritten texts demonstrate that SRR-GAN outperforms the cascaded scheme on low-resolution images. The results show that SRR-GAN improves recognition accuracy by a relative 10%-20% on five datasets of scene and handwritten text. Meanwhile, SRR-GAN maintains high performance on high-resolution images.
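The abstract reports its gains as relative rather than absolute improvements, and the difference is easy to misread. The following is a minimal sketch of that arithmetic only; the function name `relative_improvement` and the two accuracy values are hypothetical and chosen purely for illustration, not taken from the paper.

```python
def relative_improvement(baseline_acc: float, new_acc: float) -> float:
    """Return the gain of new_acc over baseline_acc as a fraction of the baseline."""
    if baseline_acc <= 0:
        raise ValueError("baseline accuracy must be positive")
    return (new_acc - baseline_acc) / baseline_acc


# Hypothetical accuracies for illustration only (not results from the paper).
cascaded_acc = 0.50   # assumed accuracy of a cascaded baseline
joint_acc = 0.575     # assumed accuracy of a jointly trained model

absolute_gain = joint_acc - cascaded_acc
gain = relative_improvement(cascaded_acc, joint_acc)

print(f"absolute gain: {absolute_gain:.3f}")
print(f"relative gain: {gain:.1%}")
```

Under these assumed numbers, a gain of 7.5 percentage points corresponds to a 15% relative improvement, which would fall inside the 10%-20% range the abstract describes.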
