Ligature Recognition in Urdu Caption Text using Deep Convolutional Neural Networks

Textual content in videos contains rich information that can be exploited for semantic indexing and subsequent retrieval, as well as for the development of video analytics solutions. The key modules in a textual-content-based video retrieval system are the detection (localization) of text and its subsequent recognition, the latter being the subject of our study. More specifically, this paper presents a caption text recognition system targeting Urdu text. The technique relies on a holistic approach using ligatures as units of recognition. Data-driven feature extraction is performed using a number of pre-trained deep convolutional neural networks. The networks are used both as feature extractors and as models fine-tuned on the ligature dataset under study, and they achieve high ligature recognition rates.
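The difference between using a pre-trained network as a fixed feature extractor and fine-tuning it can be illustrated with a minimal, standard-library-only sketch. It is a toy stand-in, not the networks or data used in the paper: the single "backbone" layer, its hand-set weights, the four-pixel inputs, and the names TinyNet and train are all assumptions made for illustration. In the first mode only the new classification head is trained; in the second, gradients also update the backbone.

```python
import copy
import math

# Hypothetical "pre-trained" backbone weights (hand-set, for illustration only).
PRETRAINED_BACKBONE = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]

# Toy 4-pixel "ligature images" with class labels 0 and 1 (made-up data).
DATA = [
    ([1.0, 0.0, 0.0, 1.0], 0),
    ([0.9, 0.1, 0.0, 0.8], 0),
    ([0.0, 1.0, 1.0, 0.0], 1),
    ([0.1, 0.9, 0.8, 0.0], 1),
]


def matvec(weights, x):
    """Multiply a weight matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in weights]


def relu(v):
    return [max(0.0, a) for a in v]


def softmax(z):
    m = max(z)  # subtract the max for numerical stability
    e = [math.exp(a - m) for a in z]
    s = sum(e)
    return [a / s for a in e]


class TinyNet:
    """One backbone layer (stand-in for a deep CNN) plus a new linear head."""

    def __init__(self, n_classes):
        self.backbone = copy.deepcopy(PRETRAINED_BACKBONE)
        self.head = [[0.0] * len(self.backbone) for _ in range(n_classes)]

    def features(self, x):
        return relu(matvec(self.backbone, x))

    def predict(self, x):
        scores = matvec(self.head, self.features(x))
        return scores.index(max(scores))


def train(net, data, epochs, lr, fine_tune):
    """SGD on cross-entropy loss; the backbone is updated only if fine_tune is True."""
    for _ in range(epochs):
        for x, y in data:
            pre = matvec(net.backbone, x)
            feat = relu(pre)
            probs = softmax(matvec(net.head, feat))
            # Gradient of the loss with respect to the class scores.
            d = [p - (1.0 if k == y else 0.0) for k, p in enumerate(probs)]
            if fine_tune:
                # Backpropagate through the head (before it is updated) and the ReLU.
                d_feat = [sum(d[k] * net.head[k][j] for k in range(len(d)))
                          for j in range(len(feat))]
                for j, row in enumerate(net.backbone):
                    if pre[j] > 0.0:
                        for i in range(len(row)):
                            row[i] -= lr * d_feat[j] * x[i]
            # The classification head is trained in both modes.
            for k, row in enumerate(net.head):
                for j in range(len(row)):
                    row[j] -= lr * d[k] * feat[j]
```

Calling train(net, DATA, epochs=20, lr=0.5, fine_tune=False) leaves net.backbone equal to the pre-trained weights and fits only the head, whereas fine_tune=True also adapts the backbone to the new data. With real deep networks the same choice is typically made by freezing or unfreezing the convolutional layers of a pre-trained model.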
