A novel pipeline framework for multi oriented scene text image detection and recognition

Abstract: Automatic text detection and recognition (end-to-end text recognition) in real-life images is a core component of many applications, including assistance systems for blind and low-vision users and self-driving cars. However, curved and vertical text is difficult to detect because of color bleeding, font-size variation, and complicated backgrounds. In this paper, a convolutional neural network-based pipeline is introduced to obtain high-level visual features and improve text detection and recognition efficiency. A ResNet-50 network pre-trained on ImageNet and SynthText is used to extract low-level visual features. Moreover, the proposed structure uses new improved ReLU layer (new.i.ReLU) blocks with varied receptive fields, which give it a strong ability to detect text components even on curved surfaces. New improved inception layers (new.i.inception) capture text of widely varying sizes more effectively than a linear chain of convolution layers. We also propose a pipeline framework for character recognition that is robust to irregular (curved and vertical) text. First, we introduce a novel algorithm, the local word directional pattern (LWDP), which re-encodes each pixel value to highlight the texture of the characters. Then, the LWDP output is used as the input image for the text recognition process. Experiments on standard benchmarks, including the ICDAR 2013, ICDAR 2015, and ICDAR 2019 datasets, show that the proposed architecture outperforms prior work.
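
The abstract does not specify how LWDP computes its codes, so the following is only a minimal sketch of the classic local directional pattern (LDP) with Kirsch compass masks, a well-known member of the same family of directional texture encodings. It is not the authors' LWDP. The function names `kirsch_masks` and `ldp_code`, the choice `k=3`, and the edge-replicating padding are all assumptions made for illustration.

```python
import numpy as np

# Border positions of a 3x3 window, clockwise from the top-left corner.
RING = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2), (2, 1), (2, 0), (1, 0)]


def kirsch_masks():
    """Return the 8 Kirsch compass masks as 3x3 integer arrays."""
    base = [5, 5, 5, -3, -3, -3, -3, -3]  # weights along RING for the first mask
    masks = []
    for shift in range(8):
        mask = np.zeros((3, 3), dtype=np.int32)  # centre weight stays 0
        for i, (r, c) in enumerate(RING):
            mask[r, c] = base[(i - shift) % 8]   # rotate the weights by 45 degrees
        masks.append(mask)
    return masks


def ldp_code(image, k=3):
    """Encode each pixel of a 2-D grayscale image as an 8-bit LDP code.

    Classic LDP (an assumption here, not the paper's LWDP): the k directions
    with the largest absolute Kirsch responses get bit 1, the rest get bit 0.
    """
    img = np.asarray(image, dtype=np.int32)
    h, w = img.shape
    padded = np.pad(img, 1, mode="edge")  # replicate borders (assumed choice)

    # Correlate the image with each mask using shifted views of the padded image.
    responses = np.zeros((8, h, w), dtype=np.int32)
    for d, mask in enumerate(kirsch_masks()):
        for r in range(3):
            for c in range(3):
                responses[d] += mask[r, c] * padded[r:r + h, c:c + w]

    # Rank directions per pixel by absolute response, strongest first.
    order = np.argsort(-np.abs(responses), axis=0, kind="stable")
    codes = np.zeros((h, w), dtype=np.uint8)
    for j in range(k):
        codes |= (1 << order[j]).astype(np.uint8)  # set the bit of the j-th strongest direction
    return codes


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch = rng.integers(0, 256, size=(6, 8))
    print(ldp_code(patch))
```

Every output code has exactly `k` bits set, so the encoded image emphasizes the dominant local edge directions rather than raw intensity; an encoded image of this kind could then be passed to a recognizer in place of the original pixels.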
