A feature learning method for scene text recognition

Reading text in scene images is a challenging task and remains an active research area. The difficulties stem from the low resolution, complex backgrounds, non-uniform lighting, and blurring of scene images. This paper focuses on recognizing characters in scene images. We build on the feature learning method proposed in [6] and on the comparison between sparse coding and vector quantization in [8] to construct better feature representations, which are then used to train an SVM classifier. We assess the performance of the proposed method on popular scene image datasets such as ICDAR 2003 and Chars74K. Experimental results show that the proposed system reaches an encouraging recognition rate on both datasets. More specifically, it achieves recognition rates of 83.8% (62-class problem) and 87% (36-class problem) on the ICDAR 2003 Sample subset (698 images), and 73.8% accuracy on the GoodImg subset (7705 images) of the Chars74K dataset. Our contribution is to apply the ideas and conclusions of [8] to the scene text recognition problem; the experimental results show that our system outperforms other existing methods.
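The abstract does not say which encoder the system uses, so the following is only an illustrative sketch of one encoder of the kind compared in [8]: a soft-threshold encoding of a normalized image patch against a learned dictionary. The function names, the threshold `alpha`, and the toy dictionary are assumptions made for this example and are not taken from the paper. The resulting feature vectors would then be pooled and passed to an SVM.

```python
from math import sqrt


def normalize(patch, eps=1e-8):
    """Subtract the mean and scale the patch to (approximately) unit length."""
    mean = sum(patch) / len(patch)
    centered = [v - mean for v in patch]
    norm = sqrt(sum(v * v for v in centered))
    return [v / (norm + eps) for v in centered]


def soft_threshold_encode(patch, dictionary, alpha=0.25):
    """Encode one patch against a dictionary of atoms (hypothetical helper).

    For K atoms this returns 2*K features: the positive and negative
    responses are thresholded by alpha and kept as separate features.
    """
    responses = [sum(d * x for d, x in zip(atom, patch)) for atom in dictionary]
    positive = [max(0.0, r - alpha) for r in responses]
    negative = [max(0.0, -r - alpha) for r in responses]
    return positive + negative


# Toy usage: two atoms in a 2-pixel "patch" space (illustrative values only).
toy_dictionary = [[1.0, 0.0], [0.0, 1.0]]
features = soft_threshold_encode([0.5, -1.0], toy_dictionary, alpha=0.25)
```

In a full pipeline the dictionary would be learned from training patches, for example by vector quantization or sparse coding, rather than written by hand as in this toy case.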

[1]  Zohra Saidane, et al. Automatic Scene Text Recognition using a Convolutional Neural Network, 2007.

[2]  Andrew Y. Ng, et al. Selecting Receptive Fields in Deep Networks, 2011, NIPS.

[3]  Jiri Matas, et al. A Method for Text Localization and Recognition in Real-World Images, 2010, ACCV.

[4]  K. Lange, et al. Coordinate descent algorithms for lasso penalized regression, 2008, arXiv:0803.3876.

[5]  Andrew Y. Ng, et al. Text Detection and Character Recognition in Scene Images with Unsupervised Feature Learning, 2011, 2011 International Conference on Document Analysis and Recognition.

[6]  Manik Varma, et al. Character Recognition in Natural Images, 2009, VISAPP.

[7]  Y. C. Pati, et al. Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition, 1993, Proceedings of 27th Asilomar Conference on Signals, Systems and Computers.

[8]  Andrew Y. Ng, et al. The Importance of Encoding Versus Training with Sparse Coding and Vector Quantization, 2011, ICML.

[9]  Rajat Raina, et al. Efficient sparse coding algorithms, 2006, NIPS.

[10]  Toru Wakahara, et al. Binarization and Recognition of Degraded Characters Using a Maximum Separability Axis in Color Space and GAT Correlation, 2006, 18th International Conference on Pattern Recognition (ICPR'06).

[11]  Cor J. Veenman, et al. Kernel Codebooks for Scene Categorization, 2008, ECCV.

[12]  Jean Ponce, et al. Learning mid-level features for recognition, 2010, 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition.

[13]  Le Li, et al. SENSC: a Stable and Efficient Algorithm for Nonnegative Sparse Coding, 2009.

[14]  Yihong Gong, et al. Linear spatial pyramid matching using sparse coding for image classification, 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[15]  Honglak Lee, et al. An Analysis of Single-Layer Networks in Unsupervised Feature Learning, 2011, AISTATS.

[16]  Honglak Lee, et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations, 2009, ICML '09.

[17]  Erkki Oja, et al. Independent component analysis: algorithms and applications, 2000, Neural Networks.

[18]  Cordelia Schmid, et al. Beyond Bags of Features: Spatial Pyramid Matching for Recognizing Natural Scene Categories, 2006, 2006 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR'06).

[19]  W. Effelsberg, et al. Robust Character Recognition in Low-Resolution Images and Videos, 2005.

[20]  Hao Su, et al. Object Bank: A High-Level Image Representation for Scene Classification & Semantic Feature Sparsification, 2010, NIPS.

[21]  T. Blumensath, et al. On the Difference Between Orthogonal Matching Pursuit and Orthogonal Least Squares, 2007.

[22]  Andreas Krause, et al. Advances in Neural Information Processing Systems (NIPS), 2014.