Arabic text detection in videos using neural and boosting-based approaches: Application to video indexing

Text detection in videos is a primary step in any semantic-based video analysis system. In this work, we propose and compare three machine learning-based methods for embedded Arabic text detection. These methods detect Arabic text regions without any prior knowledge and without any pre-processing. The first method relies on a convolutional neural network. The other two methods are based on a multi-exit asymmetric boosting cascade. The proposed methods have been extensively evaluated on a large database of Arabic TV channel videos. Experiments show a good detection rate for all three methods, although the neural network-based method outperforms the others in terms of recall/precision and computation time.
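As a rough illustration of the cascade idea mentioned above, the following minimal sketch shows how a multi-exit cascade might classify one candidate window at inference time: the scores of the weak classifiers are accumulated from the start, and the window is rejected at the first exit where the running score falls below that exit's threshold. The function names, the decision-stump weak classifiers, the feature values, and the thresholds are all hypothetical and chosen only for illustration; they are not the features or parameters used in this work, and the asymmetric training procedure is not shown.

```python
from typing import Callable, Dict, List, Sequence

# A weak classifier maps a feature vector to a signed score (hypothetical type).
WeakClassifier = Callable[[Sequence[float]], float]


def make_stump(index: int, cut: float, weight: float) -> WeakClassifier:
    """Toy decision stump: +weight if feature[index] > cut, else -weight."""
    return lambda x: weight if x[index] > cut else -weight


def multi_exit_cascade(
    features: Sequence[float],
    weak_classifiers: List[WeakClassifier],
    exits: Dict[int, float],
) -> bool:
    """Return True if the window passes every exit, False if rejected early.

    `exits` maps the number of weak classifiers evaluated so far to a
    rejection threshold. The running score is never reset, so each exit
    uses the evidence from all weak classifiers evaluated before it.
    """
    score = 0.0
    for count, weak in enumerate(weak_classifiers, start=1):
        score += weak(features)
        threshold = exits.get(count)
        if threshold is not None and score < threshold:
            return False  # early rejection: most non-text windows stop here
    return True


# Illustrative (made-up) cascade: three stumps, exits after the 2nd and 3rd.
weak_classifiers = [
    make_stump(0, 0.5, 1.0),
    make_stump(1, 0.2, 0.5),
    make_stump(2, 0.7, 0.8),
]
exits = {2: 0.0, 3: 1.0}

text_like = multi_exit_cascade([0.9, 0.3, 0.8], weak_classifiers, exits)
rejected_early = multi_exit_cascade([0.1, 0.1, 0.9], weak_classifiers, exits)
rejected_late = multi_exit_cascade([0.9, 0.1, 0.1], weak_classifiers, exits)
```

In this toy setup, the first window passes both exits, the second is discarded at the first exit, and the third survives the first exit but fails the last one. Early exits are what keep the average cost per window low when most windows contain no text.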
