Arabic newspaper page segmentation

The aim of layout analysis is to extract the geometric structure from a document image. It consists of labeling homogeneous regions of a document image. This paper describes the performance of segmentation algorithms and their adaptation to handle Arabic documents with complex structure, such as newspapers. Experimental tests have been carried out on four different phases of newspaper image analysis: thread recognition, frame recognition, image/text separation, text line recognition, and line merging into blocks. Some promising experimental results are reported.
