An FPGA-based Hardware Accelerator for Scene Text Character Recognition

Scene text character recognition is a challenging computer vision task: natural scene images usually have cluttered backgrounds, and a character's size, font, orientation, texture, brightness, and alignment vary unpredictably. Furthermore, systems that include scene text character recognition are usually embedded in a system on a chip (SoC), which imposes strict requirements such as low latency, small area, mobility, and flexibility while still demanding high accuracy. In this work, we propose a heterogeneous system for embedded applications with time, area, and power constraints. It combines hardware and software to accelerate a scene text character recognition technique that uses Histogram of Oriented Gradients (HOG) for feature extraction and an Extreme Learning Machine (ELM) neural network as the classifier. The system was prototyped and evaluated on the Terasic DE2i-150 embedded platform. It achieves an accuracy of 65.5% on the Chars74k-15 dataset and processes up to 11 frames per second, a good trade-off between processing time and accuracy for embedded environments. Moreover, it occupies only 11% of the logic elements of the Altera Cyclone IV FPGA, which makes it suitable for embedded systems.
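To make the HOG-plus-ELM pipeline concrete, the following is a minimal software-only sketch in Python with NumPy. It is not the hardware/software partitioning proposed in this work, and it does not reproduce the reported results. The function `hog_features`, the class `ELM`, and all parameter values (cell size, number of orientation bins, number of hidden neurons, sigmoid activation) are illustrative assumptions, since the abstract does not state the configuration used. The HOG step is also simplified: it uses one global L2 normalization instead of overlapping block normalization.

```python
import numpy as np


def hog_features(img, cell=8, bins=9):
    """Simplified HOG descriptor (assumed parameters, not the paper's).

    Builds one histogram of unsigned gradient orientations per cell,
    weighted by gradient magnitude, then L2-normalizes the whole vector.
    """
    img = img.astype(np.float64)
    gy, gx = np.gradient(img)                       # derivatives along rows, columns
    mag = np.hypot(gx, gy)                          # gradient magnitude
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0    # unsigned orientation
    bin_idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)

    h, w = img.shape
    feats = []
    for r in range(0, h - cell + 1, cell):
        for c in range(0, w - cell + 1, cell):
            hist = np.bincount(
                bin_idx[r:r + cell, c:c + cell].ravel(),
                weights=mag[r:r + cell, c:c + cell].ravel(),
                minlength=bins,
            )
            feats.append(hist)
    v = np.concatenate(feats)
    return v / (np.linalg.norm(v) + 1e-12)


class ELM:
    """Single-hidden-layer Extreme Learning Machine classifier.

    Hidden weights are random and fixed; only the output weights are
    learned, in closed form, via the Moore-Penrose pseudoinverse.
    """

    def __init__(self, n_hidden=64, seed=0):
        self.n_hidden = n_hidden
        self.rng = np.random.default_rng(seed)

    def _hidden(self, X):
        # Sigmoid activation of the random projection (an assumed choice).
        return 1.0 / (1.0 + np.exp(-(X @ self.W + self.b)))

    def fit(self, X, y):
        n_classes = int(y.max()) + 1
        self.W = self.rng.standard_normal((X.shape[1], self.n_hidden))
        self.b = self.rng.standard_normal(self.n_hidden)
        H = self._hidden(X)                  # hidden-layer output matrix
        T = np.eye(n_classes)[y]             # one-hot targets
        self.beta = np.linalg.pinv(H) @ T    # least-squares output weights
        return self

    def predict(self, X):
        return np.argmax(self._hidden(X) @ self.beta, axis=1)
```

Because the ELM learns only its output weights, and does so in a single closed-form step, inference reduces to two matrix products and one activation. That regular structure is one plausible reason such a classifier is attractive for FPGA acceleration.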
