A Method of Text Segmentation from Scanned Image with Complex Background

With the development of information technology, the number of scanned images is increasing rapidly, and many of these images contain important text. To meet the needs of image viewing, text identification, and text retrieval, this paper presents an efficient method for text segmentation. First, the text blocks in the scanned image are localized. Second, according to its gray/color distribution, each text block image is decomposed into a text sub-layer, background sub-layers, and mixed sub-layers that contain both text and background. Finally, the background is filtered out of these sub-layers, and the combination of the text in all remaining sub-layers is the text segmentation result. Experimental results show that the proposed method is robust to overlapping complex backgrounds.

Keywords: scanned image; text localization; text segmentation
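The decompose-filter-combine idea can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not specify how sub-layers are formed or how background is filtered, so the code below assumes equal-width gray-level bands as sub-layers and a simple pixel-coverage threshold as the background test. The names `decompose`, `coverage`, `segment_text`, `n_layers`, and `max_cov` are all hypothetical. It also treats each sub-layer as wholly text or wholly background, so it does not handle the mixed sub-layers the abstract mentions.

```python
from typing import List

Gray = List[List[int]]    # rows of gray levels in 0..255
Mask = List[List[bool]]   # True where a pixel belongs to the layer


def decompose(block: Gray, n_layers: int = 4) -> List[Mask]:
    """Split a text block into sub-layers, one per equal-width gray band.

    Assumption: equal-width bands stand in for the paper's (unspecified)
    gray/color distribution analysis.
    """
    width = 256 / n_layers
    layers = []
    for k in range(n_layers):
        lo, hi = k * width, (k + 1) * width
        layers.append([[lo <= px < hi for px in row] for row in block])
    return layers


def coverage(mask: Mask) -> float:
    """Fraction of pixels in the block that fall into this layer."""
    total = sum(len(row) for row in mask)
    return sum(px for row in mask for px in row) / total if total else 0.0


def segment_text(block: Gray, n_layers: int = 4, max_cov: float = 0.5) -> Mask:
    """Combine the sub-layers that are non-empty and not background-like.

    Assumption: a layer covering more than max_cov of the block is
    treated as background and dropped entirely.
    """
    result = [[False] * len(row) for row in block]
    for mask in decompose(block, n_layers):
        cov = coverage(mask)
        if cov == 0.0 or cov > max_cov:
            continue  # empty layer or background-like layer
        for r, row in enumerate(mask):
            for c, on in enumerate(row):
                if on:
                    result[r][c] = True
    return result
```

For a block with a bright background and a few dark text pixels, `segment_text` keeps only the dark pixels, because the bright band covers most of the block and is discarded. A real implementation would need a per-layer background filter to separate text from background inside mixed sub-layers.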
