Scene text extraction in natural scene images using hierarchical feature combining and verification

We propose a method that extracts text regions from natural scene images using low-level image features and verifies the extracted regions with a high-level text stroke feature; the two feature levels are combined hierarchically. The low-level features are color continuity, gray-level variation, and color variance. Color continuity is used because most characters in a text region share the same color. Gray-level variation is used because text strokes differ from the background in their gray-level values. Color variance is used because text strokes also differ from the background in color, and this measure is more sensitive than gray-level variation. For the high-level feature, text strokes are examined by applying multi-resolution wavelet transforms to local image areas, and the resulting feature vector is input to a support vector machine (SVM) for verification. We tested the proposed method on a variety of natural scene images and confirmed that extraction rates remain high even for complex images.
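
To make the feature descriptions above concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not give exact formulas, so everything here is an assumption made for illustration: gray-level variation is taken as the max-minus-min range of luminance in a local block, color variance as the sum of per-channel variances, and the wavelet as a single level of the Haar transform. All function names are hypothetical. The SVM verification step is omitted.

```python
from statistics import pvariance


def gray(pixel):
    """Luminance of an (R, G, B) pixel using the common BT.601 weights."""
    r, g, b = pixel
    return 0.299 * r + 0.587 * g + 0.114 * b


def gray_level_variation(block):
    """Assumed definition: range (max - min) of gray values in the block."""
    values = [gray(p) for row in block for p in row]
    return max(values) - min(values)


def color_variance(block):
    """Assumed definition: sum of the population variances of R, G and B."""
    pixels = [p for row in block for p in row]
    return sum(pvariance([p[c] for p in pixels]) for c in range(3))


def haar_1d(signal):
    """One level of an averaging Haar transform on an even-length signal.

    Returns (averages, details); large detail magnitudes indicate sharp
    transitions such as stroke edges. Haar is an assumed choice of wavelet.
    """
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal) - 1, 2)]
    averages = [(a + b) / 2 for a, b in pairs]
    details = [(a - b) / 2 for a, b in pairs]
    return averages, details


# Red strokes on a gray background of nearly equal luminance.
red, bg = (255, 0, 0), (76, 76, 76)
block = [[red, bg],
         [bg, red]]

print("gray-level variation:", gray_level_variation(block))
print("color variance:", color_variance(block))
print("haar details:", haar_1d([10, 200, 10, 200])[1])
```

In this toy block the strokes are almost invisible in gray level but clearly separated in color, which illustrates why a color-based measure can be the more sensitive of the two. Applying `haar_1d` again to the returned averages would give a coarser resolution level, which is one simple way to build a multi-resolution feature vector.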
