Deep learning and recurrent connectionist-based approaches for Arabic text recognition in videos

This paper addresses the recognition of Arabic text embedded in videos. The proposed methods require neither pre-processing nor character segmentation. Difficulties arising from video or text properties are handled by learning a robust representation of the input text image with Convolutional Neural Networks and Deep Auto-Encoders. Features are computed using a multi-scale sliding window scheme. A recurrent connectionist approach is then trained to predict the correct transcription of the input image from the associated sequence of features. The proposed methods are extensively evaluated on a large video database recorded from several Arabic TV channels.
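As an illustration of the multi-scale sliding window idea, the following minimal sketch cuts a text-line image into a left-to-right sequence of frames, each holding one window per scale. The function name `multiscale_windows`, the window widths, the stride, and the zero padding are assumptions chosen for the example; the abstract does not specify the actual settings, and in the described pipeline each window would be passed to a learned feature extractor rather than used as raw pixels.

```python
import numpy as np

def multiscale_windows(image, widths=(8, 16, 24), stride=4):
    """Return, for each horizontal position, one window per scale.

    image  : 2-D array (height, width), a grayscale text-line image.
    widths : hypothetical window widths in pixels, one per scale.
    stride : hypothetical horizontal step between window centers.
    """
    height, width = image.shape
    max_half = max(widths) // 2
    # Zero-pad horizontally so the widest window fits at every center.
    padded = np.pad(image, ((0, 0), (max_half, max_half)), mode="constant")
    sequence = []
    for center in range(0, width, stride):
        c = center + max_half  # center position in padded coordinates
        frame = []
        for w in widths:
            start = c - w // 2
            frame.append(padded[:, start:start + w])
        sequence.append(frame)
    return sequence

# Usage: a synthetic 32 x 100 "image" whose pixel value equals its column index.
image = np.tile(np.arange(100), (32, 1))
frames = multiscale_windows(image)
```

Each element of `frames` groups windows of increasing width around the same center, so a sequence model receives both local and wider context at every step.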
