Arbitrary Shape Scene Text Detection With Adaptive Text Region Representation

Scene text detection attracts much attention in computer vision, because it can be widely used in many applications such as real-time text translation, automatic information entry, blind person assistance, robot sensing and so on. Though many methods have been proposed for horizontal and oriented texts, detecting irregular shape texts such as curved texts is still a challenging problem. To solve the problem, we propose a robust scene text detection method with adaptive text region representation. Given an input image, a text region proposal network is first used for extracting text proposals. Then, these proposals are verified and refined with a refinement network. Here, recurrent neural network based adaptive text region representation is proposed for text region refinement, where a pair of boundary points are predicted each time step until no new points are found. In this way, text regions of arbitrary shapes are detected and represented with adaptive number of boundary points. This gives more accurate description of text regions. Experimental results on five benchmarks, namely, CTW1500, TotalText, ICDAR2013, ICDAR2015 and MSRA-TD500, show that the proposed method achieves state-of-the-art in scene text detection.

[1]  Lei Sun,et al.  A robust approach for text detection from natural scene images , 2015, Pattern Recognit..

[2]  Ernest Valveny,et al.  ICDAR 2015 competition on Robust Reading , 2015, 2015 13th International Conference on Document Analysis and Recognition (ICDAR).

[3]  Lianwen Jin,et al.  Deep Matching Prior Network: Toward Tighter Multi-oriented Text Detection , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[4]  Xin He,et al.  TextSnake: A Flexible Representation for Detecting Text of Arbitrary Shapes , 2018, ECCV.

[5]  Shijian Lu,et al.  Text Flow: A Unified Text Detection System in Natural Scene Images , 2015, 2015 IEEE International Conference on Computer Vision (ICCV).

[6]  Xiang Bai,et al.  Detecting Oriented Text in Natural Images by Linking Segments , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[7]  Kaiming He,et al.  Mask R-CNN , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[8]  Sanja Fidler,et al.  Annotating Object Instances with a Polygon-RNN , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[9]  Wei Liu,et al.  SSD: Single Shot MultiBox Detector , 2015, ECCV.

[10]  Qiangpeng Yang,et al.  IncepText: A New Inception-Text Module with Deformable PSROI Pooling for Multi-Oriented Scene Text Detection , 2018, IJCAI.

[11]  Fei Yin,et al.  Deep Direct Regression for Multi-oriented Scene Text Detection , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).

[12]  Xuelong Li,et al.  PixelLink: Detecting Scene Text via Instance Segmentation , 2018, AAAI.

[13]  Wenyu Liu,et al.  TextBoxes: A Fast Text Detector with a Single Deep Neural Network , 2016, AAAI.

[14]  Gang Sun,et al.  Squeeze-and-Excitation Networks , 2017, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[15]  Chee Seng Chan,et al.  Total-Text: A Comprehensive Dataset for Scene Text Detection and Recognition , 2017, 2017 14th IAPR International Conference on Document Analysis and Recognition (ICDAR).

[16]  Xiang Bai,et al.  Symmetry-based text line detection in natural scenes , 2015, 2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[17]  Kaiming He,et al.  Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks , 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[18]  Jon Almazán,et al.  ICDAR 2013 Robust Reading Competition , 2013, 2013 12th International Conference on Document Analysis and Recognition.

[19]  Lianwen Jin,et al.  Feature Enhancement Network: A Refined Scene Text Detector , 2017, AAAI.

[20]  Xiangyang Xue,et al.  Arbitrary-Oriented Scene Text Detection via Rotation Proposals , 2017, IEEE Transactions on Multimedia.

[21]  Shuchang Zhou,et al.  EAST: An Efficient and Accurate Scene Text Detector , 2017, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[22]  Lianwen Jin,et al.  Curved scene text detection via transverse and longitudinal sequence connection , 2019, Pattern Recognit..

[23]  Xiang Bai,et al.  Mask TextSpotter: An End-to-End Trainable Neural Network for Spotting Text with Arbitrary Shapes , 2018, IEEE Transactions on Pattern Analysis and Machine Intelligence.

[24]  Shuicheng Yan,et al.  Multi-oriented Scene Text Detection via Corner Localization and Region Segmentation , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[25]  Wei Li,et al.  R2 CNN: Rotational Region CNN for Arbitrarily-Oriented Scene Text Detection , 2018, 2018 24th International Conference on Pattern Recognition (ICPR).

[26]  Xu-Cheng Yin,et al.  Robust Text Detection in Natural Scene Images. , 2014, IEEE transactions on pattern analysis and machine intelligence.

[27]  Xiang Bai,et al.  TextBoxes++: A Single-Shot Oriented Scene Text Detector , 2018, IEEE Transactions on Image Processing.

[28]  Wenyu Liu,et al.  Multi-oriented Text Detection with Fully Convolutional Networks , 2016, 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[29]  Jürgen Schmidhuber,et al.  Long Short-Term Memory , 1997, Neural Computation.

[30]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[31]  Junjie Yan,et al.  FOTS: Fast Oriented Text Spotting with a Unified Network , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[32]  Gui-Song Xia,et al.  Rotation-Sensitive Regression for Oriented Scene Text Detection , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.

[33]  Zhuowen Tu,et al.  Detecting Texts of Arbitrary Orientations in 1 Natural Images , 2012 .

[34]  Weisi Lin,et al.  Learning Markov Clustering Networks for Scene Text Detection , 2018, 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition.