A Novel Image Text Extraction Method Based on K-Means Clustering

Text in web pages, images and videos carries important clues for information indexing and retrieval. Most existing text extraction methods depend on the language type and on the appearance of the text. This paper proposes a novel, universal method of image text extraction. Text is located in a coarse-to-fine manner: first, a multi-scale approach is adopted to locate text with different font sizes; second, projection profiles are used in the location refinement step. Text segmentation is then performed by color-based k-means clustering. Compared with the grayscale images used in most existing methods, color images are more suitable for clustering-based segmentation. Because the clustering treats corner points, edge points and all other points equally, it can handle multilingual text. Experimental results show that the best performance is obtained when k is 3. Comparative experiments on a large number of images show that the method is accurate and robust under various conditions.
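The color-based segmentation step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes plain Lloyd's k-means on RGB values with squared Euclidean distance and a deterministic farthest-first initialization, none of which the abstract specifies. The function names (`farthest_first_init`, `kmeans_colors`, `segment_region`) and the synthetic test region are hypothetical; only k = 3 comes from the text.

```python
import numpy as np

def farthest_first_init(pixels, k):
    """Pick k initial centers: the first pixel, then repeatedly the pixel
    farthest from all centers chosen so far (an assumed init strategy)."""
    centers = [pixels[0]]
    for _ in range(1, k):
        d = np.min([((pixels - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(pixels[d.argmax()])
    return np.array(centers, dtype=np.float64)

def assign(pixels, centers):
    """Label each pixel with the index of its nearest center."""
    dists = ((pixels[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1)

def kmeans_colors(pixels, k=3, iters=20):
    """Lloyd's k-means on an (N, channels) array of color values."""
    pixels = pixels.astype(np.float64)
    centers = farthest_first_init(pixels, k)
    for _ in range(iters):
        labels = assign(pixels, centers)
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # leave an empty cluster's center unchanged
                centers[j] = members.mean(axis=0)
    return assign(pixels, centers), centers

def segment_region(region, k=3):
    """Cluster the colors of an (H, W, channels) text region into k classes."""
    h, w, c = region.shape
    labels, centers = kmeans_colors(region.reshape(-1, c), k)
    return labels.reshape(h, w), centers

# Synthetic located text region: background, dark outline, bright strokes.
region = np.zeros((20, 60, 3), dtype=np.uint8)
region[:] = (200, 30, 30)
region[5:15, 10:50] = (0, 0, 0)
region[7:13, 14:46] = (255, 255, 255)

label_map, centers = segment_region(region, k=3)
```

Every pixel enters the clustering with the same weight, regardless of whether it lies on a corner, an edge or a flat area. Deciding which of the k clusters corresponds to the text strokes is a separate step that this sketch does not attempt.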
