Bilateral Convolutional Activations Encoded with Fisher Vectors for Scene Character Recognition

A rich and robust representation of scene characters plays a significant role in automatically understanding text in images. In this letter, we focus on feature representation and propose a novel encoding method, bilateral convolutional activations encoded with Fisher vectors (BCA-FV), for scene character recognition. Concretely, we first extract convolutional activation descriptors from convolutional maps and then build a bilateral convolutional activation map (BCAM) to capture the relationship between the convolutional activation response and the spatial structure information. Finally, to obtain a global feature representation, the BCAM is incorporated into the Fisher vector (FV) encoding of the convolutional activation descriptors. The BCA-FV thus integrates both the prominent features and the spatial structure information into the character representation. We evaluate our method on two widely used databases (ICDAR2003 and Chars74K), and the experimental results demonstrate that it outperforms state-of-the-art methods. In addition, we validate the proposed BCA-FV on the “Pan+ChiPhoto” database for Chinese scene character recognition, and the experimental results show that it generalizes well.

key words: bilateral convolutional activations, Fisher vectors, scene character recognition
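The pipeline described above (descriptors from convolutional maps, a weight map combining activation response with spatial structure, then Fisher vector encoding) can be illustrated with a minimal NumPy sketch. The abstract does not give the BCAM formula, so `bilateral_weight_map` below is a hypothetical stand-in: it multiplies a normalized activation response by a Gaussian of the distance to the map centre, by analogy with bilateral filtering. The function names, the `sigma_s` parameter, and the randomly initialized GMM are likewise assumptions made for illustration, not the authors' implementation. The Fisher vector part follows the standard mean and variance gradients with power and L2 normalization.

```python
import numpy as np

def conv_descriptors(fmap):
    """Flatten an (H, W, C) convolutional map into H*W descriptors of dim C."""
    h, w, c = fmap.shape
    return fmap.reshape(h * w, c)

def bilateral_weight_map(fmap, sigma_s=0.5):
    """Hypothetical BCAM stand-in: activation response times a spatial term.

    Assumes non-negative (post-ReLU) activations.
    """
    h, w, _ = fmap.shape
    response = fmap.sum(axis=2)                       # total activation per location
    response = response / (response.max() + 1e-12)    # scale to [0, 1]
    ys, xs = np.mgrid[0:h, 0:w]
    dy = (ys - (h - 1) / 2) / h                       # normalized offset from centre
    dx = (xs - (w - 1) / 2) / w
    spatial = np.exp(-(dx ** 2 + dy ** 2) / (2 * sigma_s ** 2))
    return (response * spatial).reshape(-1)           # one weight per descriptor

def weighted_fisher_vector(x, weights, prior, mu, sigma2):
    """Fisher vector of descriptors x under a diagonal GMM.

    x: (N, D); weights: (N,); prior: (K,); mu, sigma2: (K, D).
    Each descriptor's posterior is scaled by its weight before pooling.
    """
    diff = x[:, None, :] - mu[None, :, :]             # (N, K, D)
    log_p = -0.5 * (np.sum(diff ** 2 / sigma2, axis=2)
                    + np.sum(np.log(2 * np.pi * sigma2), axis=1))
    log_p = log_p + np.log(prior)
    log_p = log_p - log_p.max(axis=1, keepdims=True)  # numerical stability
    gamma = np.exp(log_p)
    gamma = gamma / gamma.sum(axis=1, keepdims=True)  # posteriors, (N, K)

    g = gamma * weights[:, None]                      # weighted posteriors
    norm = weights.sum() + 1e-12
    z = diff / np.sqrt(sigma2)                        # standardized residuals
    g_mu = np.einsum('nk,nkd->kd', g, z) / (norm * np.sqrt(prior)[:, None])
    g_sig = np.einsum('nk,nkd->kd', g, z ** 2 - 1) / (norm * np.sqrt(2 * prior)[:, None])

    fv = np.concatenate([g_mu.ravel(), g_sig.ravel()])
    fv = np.sign(fv) * np.sqrt(np.abs(fv))            # power normalization
    return fv / (np.linalg.norm(fv) + 1e-12)          # L2 normalization

# Usage with synthetic data; a real system would take fmap from a CNN layer
# and fit the GMM to training descriptors with EM.
rng = np.random.default_rng(0)
H, W, C, K = 4, 4, 8, 3
fmap = np.maximum(rng.normal(size=(H, W, C)), 0.0)    # fake post-ReLU activations
descriptors = conv_descriptors(fmap)
weights = bilateral_weight_map(fmap)
prior = np.full(K, 1.0 / K)
mu = rng.normal(size=(K, C))
sigma2 = np.ones((K, C))
fv = weighted_fisher_vector(descriptors, weights, prior, mu, sigma2)
```

The resulting vector has dimension 2·K·C and could be passed to a linear classifier. Swapping in a different weight map only requires replacing `bilateral_weight_map`, since the encoder treats the weights as given.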
