Learning Spatially Embedded Discriminative Part Detectors for Scene Character Recognition

Recognizing scene characters is extremely challenging because of interference factors such as character translation, blur, and uneven illumination. Characters are composed of parts, and different parts attract different amounts of attention when people observe a character, so each part should be assigned a different importance during recognition. In this paper, we propose a discriminative character representation that aggregates the responses of spatially embedded salient part detectors. Specifically, we first extract convolutional activations from a pre-trained convolutional neural network (CNN) and treat them as local descriptors of the character parts. We then learn a set of part detectors and select the distinctive convolutional activations that respond to the salient parts. Moreover, to alleviate the effects of character translation, rotation, and deformation, we assign a response region to each part detector and search for the maximal response within this region. Finally, we aggregate the maximal outputs of all the salient part detectors to represent the character. Experiments on three datasets show the effectiveness of the proposed method for scene character recognition.
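The aggregation step described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not say how the detectors are learned or how the response regions are shaped, so the sketch assumes linear detectors (one weight vector per part) and rectangular regions, and the names `part_detector_features`, `detectors`, and `regions` are hypothetical.

```python
import numpy as np


def part_detector_features(activations, detectors, regions):
    """Aggregate region-wise maximal responses of part detectors.

    activations: (H, W, D) convolutional feature map of one character image.
    detectors:   (K, D) weights, one ASSUMED linear detector per salient part.
    regions:     K tuples (r0, r1, c0, c1), ASSUMED rectangular, half-open.
    Returns a length-K feature vector.
    """
    # Response of every detector at every spatial position: shape (H, W, K).
    responses = activations @ detectors.T
    features = np.empty(len(regions))
    for k, (r0, r1, c0, c1) in enumerate(regions):
        # Searching the whole region tolerates small shifts of the part.
        features[k] = responses[r0:r1, c0:c1, k].max()
    return features


# Toy example: a 4x4 map with 2 channels and two detectors.
activations = np.zeros((4, 4, 2))
activations[1, 2] = [1.0, 0.0]   # part 0 fires in the upper half
activations[3, 0] = [0.0, 2.0]   # part 1 fires in the lower-left corner
detectors = np.eye(2)
regions = [(0, 2, 0, 4), (2, 4, 0, 2)]
features = part_detector_features(activations, detectors, regions)

# The same parts, shifted by one cell but still inside their regions.
shifted = np.zeros((4, 4, 2))
shifted[0, 3] = [1.0, 0.0]
shifted[2, 1] = [0.0, 2.0]
shifted_features = part_detector_features(shifted, detectors, regions)
```

In this toy case `features` and `shifted_features` are identical, which is the intended effect of searching for the maximum within a region rather than reading the response at a fixed position.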
