Cursive Scene Text Analysis by Deep Convolutional Linear Pyramids

Camera-captured images can be investigated from many angles, and the emphasis of research depends on the regions of interest: the focus may be on color segmentation, object detection, or scene text analysis. Image analysis, visibility analysis, and layout analysis come easily to humans, but the same tasks are challenging when performed by machines. Learning machines learn from the properties of the samples they are given. Numerous approaches to scene text extraction and recognition have been designed in recent years, and efforts to improve their accuracy are ongoing. Convolutional approaches have given reasonable results on non-cursive text appearing in natural images. The work presented in this manuscript exploits the strength of linear pyramids by treating each pyramid as a feature of the provided sample. Each pyramid image is processed through various empirically selected kernels. Performance was investigated on Arabic text, using each image pyramid of the EASTR-42k dataset. An error rate of 0.17% is reported on Arabic scene text recognition.
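
The idea of passing each pyramid level through a set of kernels can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the manuscript: the pyramid is built by 2x2 block averaging, the two Sobel kernels stand in for the empirically selected kernels, and the names `build_pyramid`, `correlate2d_valid`, and `pyramid_features` are hypothetical.

```python
import numpy as np

# Illustrative kernels only; the manuscript's kernels are selected empirically
# and are not specified in the abstract.
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
KERNELS = {"sobel_x": SOBEL_X, "sobel_y": SOBEL_X.T}


def downsample2(img):
    """Halve each dimension by averaging non-overlapping 2x2 blocks."""
    h, w = img.shape
    h, w = h - h % 2, w - w % 2  # drop a trailing odd row/column
    blocks = img[:h, :w].reshape(h // 2, 2, w // 2, 2)
    return blocks.mean(axis=(1, 3))


def build_pyramid(img, levels):
    """Return [full resolution, half, quarter, ...] with `levels` entries."""
    pyramid = [np.asarray(img, dtype=float)]
    for _ in range(levels - 1):
        pyramid.append(downsample2(pyramid[-1]))
    return pyramid


def correlate2d_valid(img, kernel):
    """2D cross-correlation without padding ('valid' output size)."""
    kh, kw = kernel.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    # Accumulate one shifted, weighted copy of the image per kernel entry.
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * img[i:i + oh, j:j + ow]
    return out


def pyramid_features(img, levels=3):
    """Apply every kernel to every pyramid level; one dict of maps per level."""
    return [
        {name: correlate2d_valid(level, k) for name, k in KERNELS.items()}
        for level in build_pyramid(img, levels)
    ]


# Usage on a synthetic 32x32 grayscale image.
image = np.arange(32 * 32, dtype=float).reshape(32, 32)
features = pyramid_features(image, levels=3)
for level, maps in enumerate(features):
    print(level, {name: fmap.shape for name, fmap in maps.items()})
```

In a recognition pipeline the per-level response maps would feed a classifier; in this sketch they are simply returned so that their shapes can be inspected.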
