Language identification from multi-lingual scene text images: a CNN based classifier ensemble approach

Over the past two decades, detecting text regions in complex natural images has emerged as a problem of great interest to the research community, because these regions serve as a source of information that can be used for various purposes. However, these regions may contain text in multiple languages, so identifying the language of a detected scene text becomes important for further information processing. Language identification of text captured in the wild is an extremely challenging problem in the domain of scene text recognition. In this paper, a deep learning-based classifier combination approach is proposed to solve the problem of language identification from multi-lingual scene text images. A minimalist Convolutional Neural Network architecture is used as the base model. Five variants of an input image (the three individual channels of the RGB color model, i.e. R for red, G for green and B for blue, along with the RGB image itself and the grayscale image) are passed through the base model separately. The outputs of these five models are combined using classifier combination approaches based on the sum rule and the product rule. The performance of the proposed model has been evaluated on standard datasets such as KAIST and MLe2e, as well as on an in-house multi-lingual scene text dataset. The experimental results show that the proposed model outperforms some state-of-the-art methods considered here for comparison.
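The five-input, two-rule combination described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `make_variants` and `combine_scores` are hypothetical, the grayscale conversion weights are an assumption (the abstract does not say how the grayscale image is obtained), and the base CNN is left out entirely, so the combiner simply takes one class-probability vector per model.

```python
import numpy as np


def make_variants(rgb_image):
    """Build the five input variants named in the abstract: R, G, B, RGB, grayscale.

    rgb_image: array of shape (height, width, 3) in RGB channel order.
    """
    rgb = np.asarray(rgb_image, dtype=np.float32)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    # Assumption: ITU-R BT.601 luma weights; the abstract does not specify the conversion.
    gray = 0.299 * r + 0.587 * g + 0.114 * b
    return {"R": r, "G": g, "B": b, "RGB": rgb, "gray": gray}


def combine_scores(probs, rule="sum"):
    """Fuse per-model class probabilities with the sum rule or the product rule.

    probs: array-like of shape (n_models, n_classes), one probability row per model.
    Returns (predicted_class_index, normalized_combined_scores).
    """
    probs = np.asarray(probs, dtype=np.float64)
    if rule == "sum":
        scores = probs.sum(axis=0)
    elif rule == "product":
        # Multiply in the log domain for numerical stability; the small
        # epsilon (an assumption of this sketch) avoids log(0).
        scores = np.exp(np.log(probs + 1e-12).sum(axis=0))
    else:
        raise ValueError("rule must be 'sum' or 'product'")
    return int(np.argmax(scores)), scores / scores.sum()


if __name__ == "__main__":
    # Made-up probabilities for five models and three languages, for illustration only.
    example = [
        [0.9, 0.1, 0.0],
        [0.9, 0.1, 0.0],
        [0.9, 0.1, 0.0],
        [0.9, 0.1, 0.0],
        [0.0, 0.6, 0.4],
    ]
    for rule in ("sum", "product"):
        label, scores = combine_scores(example, rule)
        print(rule, label, scores)
```

The two rules can disagree: under the product rule a single model that assigns near-zero probability to a class effectively vetoes it, whereas the sum rule lets the remaining models outvote that one.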
