Paper Information - Character Region Awareness Network For Scene Text Recognition

Recognizing text in natural scenes remains very challenging because of arbitrary shapes, varying fonts, complex backgrounds, and other factors. Recently, some recognizers have used a Spatial Transformer Network (STN) to rectify irregular text instances and have achieved promising results. However, their robustness and accuracy are still limited, since rectification performance is easily degraded by challenging samples. To tackle this issue, we propose a simple yet effective two-dimensional (2D) character attention module that enhances foreground text instances via character region awareness. By incorporating this module into an existing rectification pipeline, we build a novel scene text recognizer named Character Region Awareness Network (CRAN). Extensive experiments demonstrate that CRAN outperforms previous methods on nearly all benchmarks of both regular and irregular text, particularly on SVT (+2.0%), SVTP (+1.5%) and CUTE80 (+2.1%).
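
As a rough illustration of what "enhancing foreground text via a 2D character attention map" could look like, the sketch below re-weights a feature map with a per-pixel character-region score. It is not the authors' implementation: the function name `character_attention`, the sigmoid activation, and the residual gating form `F * (1 + A)` are all assumptions chosen for clarity, and plain Python lists stand in for tensors.

```python
import math


def sigmoid(x):
    """Logistic function mapping a real-valued logit to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def character_attention(features, region_logits):
    """Re-weight a C x H x W feature map with an H x W character-region map.

    Hypothetical sketch: the attention map A = sigmoid(region_logits) is
    applied as a residual gate, enhanced = F * (1 + A), so that positions
    scored as character regions are amplified and background positions
    are left roughly unchanged.
    """
    # 2D attention map, one score in (0, 1) per spatial position.
    attention = [[sigmoid(v) for v in row] for row in region_logits]
    # Broadcast the same 2D map over every channel of the feature map.
    enhanced = [
        [
            [f * (1.0 + a) for f, a in zip(f_row, a_row)]
            for f_row, a_row in zip(channel, attention)
        ]
        for channel in features
    ]
    return enhanced, attention


# Toy input: 1 channel, 2 x 2 spatial grid, all feature values equal to 1.
features = [[[1.0, 1.0], [1.0, 1.0]]]
# Assumed logits: top-left is confidently "character", top-right is
# confidently "background", bottom row is undecided.
region_logits = [[10.0, -10.0], [0.0, 0.0]]

enhanced, attention = character_attention(features, region_logits)
```

In this toy case the top-left feature is nearly doubled, the top-right feature stays close to 1, and the undecided positions are scaled by 1.5. In a real recognizer the enhanced features would then be passed to the rectification and decoding stages; how CRAN actually combines the attention map with its features is not specified in the abstract.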
