Paper information - Deep learning and recurrent connectionist-based approaches for Arabic text recognition in videos

This paper focuses on recognizing Arabic text embedded in videos. The proposed methods operate without any pre-processing or character segmentation. Difficulties arising from video and text properties are addressed by learning a robust representation of the input text image with Convolutional Neural Networks and Deep Auto-Encoders. Features are computed using a multi-scale sliding window scheme. A recurrent connectionist approach is then trained to predict the correct transcription of the input image from the associated sequence of features. The proposed methods are extensively evaluated on a large video database recorded from several Arabic TV channels.
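
The multi-scale sliding window step mentioned in the abstract can be pictured with a small sketch. This is only an illustration under stated assumptions, not the authors' implementation: the function names, the window scales (fractions of the line height), the stride, the output size, and the nearest-neighbour resize are all hypothetical choices, since the abstract gives no parameter values. Each horizontal position yields one group of windows of different widths, all resized to a common size; in the paper's pipeline, such windows would then be passed to the learned feature extractor (CNN or auto-encoder) to form the feature sequence read by the recurrent model.

```python
def resize_nearest(patch, out_h, out_w):
    """Nearest-neighbour resize of a 2-D list (stand-in for a real image resize)."""
    in_h, in_w = len(patch), len(patch[0])
    return [
        [patch[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
        for r in range(out_h)
    ]


def multiscale_windows(image, scales=(0.5, 1.0, 1.5), stride=4, out_size=16):
    """Slide windows of several widths across a text-line image.

    `image` is a 2-D list of grey levels (rows of pixels). The result is a
    list with one entry per horizontal step; each entry holds one
    out_size x out_size window per scale, all centred on the same column.
    The scales, stride, and out_size defaults are assumed values.
    """
    height, width = len(image), len(image[0])
    # Window widths are expressed relative to the text-line height (assumption).
    win_widths = [max(1, round(s * height)) for s in scales]
    frames = []
    for centre in range(0, width, stride):
        group = []
        for win in win_widths:
            left = centre - win // 2
            # Clamp column indices so windows near the borders repeat edge pixels.
            cols = [min(max(x, 0), width - 1) for x in range(left, left + win)]
            patch = [[row[x] for x in cols] for row in image]
            group.append(resize_nearest(patch, out_size, out_size))
        frames.append(group)
    return frames


# Usage: a synthetic 32 x 100 "text line" gives one window group every 4 columns.
image = [[(x + y) % 256 for x in range(100)] for y in range(32)]
frames = multiscale_windows(image)
print(len(frames), len(frames[0]))
```

Because no character segmentation is performed, the sequence of window groups is ordered only by horizontal position; aligning positions with characters is left to the recurrent model.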
