ALIF: A dataset for Arabic embedded text recognition in TV broadcast

This paper introduces ALIF, a dataset for Arabic embedded text recognition in TV broadcast. The dataset is publicly available for non-commercial use and is composed of a large number of manually annotated text images extracted from Arabic TV broadcast. It is the first public dataset dedicated to the development and evaluation of video Arabic OCR techniques. The text images are highly variable in terms of text characteristics (fonts, sizes, colors, etc.) and acquisition conditions (background complexity, low resolution, non-uniform luminosity and contrast, etc.). Moreover, an important part of the dataset is finely annotated: the text in an image is segmented into characters, PAWs (pieces of Arabic words), and words, and each segment is labeled. The dataset can therefore be used for both segmentation-based and segmentation-free text recognition techniques. To illustrate how the ALIF dataset can be used, we also present the results of an evaluation study we conducted on several Arabic text recognition techniques.
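As an illustration of how fine-grained annotations of this kind are typically consumed, the minimal sketch below loads a per-image annotation file and groups labeled segments by annotation level (character, PAW, word). The file name, JSON layout, and field names are assumptions made for illustration only; the actual ALIF annotation format is not specified in this abstract.

```python
import json
from collections import defaultdict

# Minimal sketch, assuming a hypothetical JSON layout in which each text image
# has a list of labeled segments, each tagged with its level ("char", "paw" or
# "word"), a bounding box and a transcription. These file and field names are
# illustrative assumptions, not the ALIF distribution format.

def load_segments(annotation_path):
    """Group the labeled segments of one text image by annotation level."""
    with open(annotation_path, encoding="utf-8") as f:
        annotation = json.load(f)

    segments_by_level = defaultdict(list)
    for segment in annotation.get("segments", []):
        level = segment["level"]            # "char", "paw" or "word" (assumed)
        segments_by_level[level].append({
            "label": segment["label"],      # Unicode transcription of the segment
            "bbox": segment["bbox"],        # [x, y, width, height] in pixels (assumed)
        })
    return segments_by_level

if __name__ == "__main__":
    segments = load_segments("alif_image_0001.json")  # hypothetical file name
    for level, items in segments.items():
        print(level, len(items), "segments")
```

Grouping by level in this way makes it straightforward to feed the same image either to a segmentation-based recognizer (character or PAW crops with their labels) or to a segmentation-free one (the full line with its word-level transcription).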
