Color text image binarization based on binary texture analysis

This paper presents a novel binarization algorithm for color text images. The algorithm integrates color clustering with binary texture analysis and can handle images with complex backgrounds. Dimensionality reduction and graph-theoretical clustering are applied first, yielding one candidate binary image per color cluster. Binary texture analysis is then performed on each candidate binary image. Two kinds of texture features are extracted and examined: one based on run-length histograms and one based on spatial-size distributions. A linear discriminant analysis classifier then uses these features to select the candidate that gives the best binarization. Experiments on images collected from the Internet, together with comparisons against existing techniques, demonstrate the effectiveness of the algorithm.
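
To make the run-length idea concrete, the following is a minimal sketch of a horizontal run-length histogram for a binary image. The abstract does not specify the exact feature definition, so the function names, the `max_len` bin count, and the normalization used here are illustrative assumptions, not the authors' formulation.

```python
import numpy as np

def run_lengths(row, value):
    """Return the lengths of maximal runs of `value` in a 1-D array."""
    mask = np.asarray(row) == value
    # Pad with False on both sides so every run has a start and an end.
    padded = np.concatenate(([False], mask, [False])).astype(np.int8)
    diff = np.diff(padded)
    starts = np.flatnonzero(diff == 1)
    ends = np.flatnonzero(diff == -1)
    return ends - starts

def run_length_histogram(binary, value=1, max_len=8):
    """Normalized histogram of horizontal run lengths (assumed feature form).

    Bin i counts runs of length i + 1; runs longer than `max_len`
    are accumulated in the last bin.
    """
    hist = np.zeros(max_len, dtype=float)
    for row in np.asarray(binary):
        lengths = np.minimum(run_lengths(row, value), max_len)
        # Index 0 of bincount is always empty because runs have length >= 1.
        hist += np.bincount(lengths, minlength=max_len + 1)[1:]
    total = hist.sum()
    return hist / total if total > 0 else hist

# Hypothetical 3x4 candidate binary image.
candidate = np.array([[1, 1, 0, 1],
                      [0, 0, 0, 0],
                      [1, 1, 1, 1]])
features = run_length_histogram(candidate, value=1, max_len=4)
```

Vertical runs can be obtained by passing `candidate.T`, and background runs by setting `value=0`. In a pipeline like the one described, such vectors would be computed for every candidate binary image before classification.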
