Paper information - Scale robust deep oriented-text detection network

Abstract: Text detection is a prerequisite for text recognition, and multi-oriented text detection has recently become an active research topic. Existing multi-oriented text detection methods fall short on two issues: 1) text scale varies over a wide range, and 2) foreground and background classes are imbalanced. In this paper, we propose a scale-robust deep multi-oriented text-detection model that has the efficiency of a one-stage deep detection model while reaching accuracy comparable to that of two-stage deep text-detection models. We design a feature refining block that fuses multi-scale context features so that text detection can be performed on a higher-resolution feature map. Moreover, to mitigate the foreground-background class imbalance, Focal Loss is adopted to up-weight hard-to-classify samples. Our method is evaluated on four benchmark text datasets: ICDAR2013, ICDAR2015, COCO-Text and MSRA-TD500. The experimental results demonstrate that our method is superior to existing one-stage deep text-detection models and comparable to state-of-the-art text detection methods.
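
To make the role of Focal Loss concrete, the sketch below computes the standard binary form, FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t), for a single prediction and compares it with plain cross-entropy. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, and the defaults alpha = 0.25 and gamma = 2 are the commonly used values from the original Focal Loss work, which the abstract does not confirm for this model. The loss works by shrinking the contribution of easy, well-classified samples, which leaves the hard ones with relatively more weight.

```python
import math


def cross_entropy(p, y, eps=1e-12):
    """Plain binary cross-entropy for one prediction.

    p: predicted probability of the text (foreground) class, in [0, 1]
    y: ground-truth label, 1 for text and 0 for background
    """
    p_t = p if y == 1 else 1.0 - p          # probability assigned to the true class
    p_t = min(max(p_t, eps), 1.0)           # clamp to avoid log(0)
    return -math.log(p_t)


def binary_focal_loss(p, y, alpha=0.25, gamma=2.0, eps=1e-12):
    """Binary focal loss for one prediction (illustrative defaults, assumed)."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha   # class-balancing weight
    p_t = min(max(p_t, eps), 1.0)
    # (1 - p_t) ** gamma is near 0 for confident, correct predictions,
    # so easy samples contribute very little to the total loss.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    samples = [
        ("easy background", 0.05, 0),   # confidently and correctly rejected
        ("hard text", 0.10, 1),         # text region the model almost misses
    ]
    for name, p, y in samples:
        ce = cross_entropy(p, y)
        fl = binary_focal_loss(p, y)
        print(f"{name}: CE={ce:.4f} FL={fl:.6f} FL/CE={fl / ce:.4f}")
```

Setting `gamma=0.0` and `alpha=1.0` for a positive sample reduces the focal loss to ordinary cross-entropy, which is a quick sanity check on the modulating factor.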
