Color text extraction with selective metric-based clustering

Natural scene images usually contain varying colors, which makes segmentation more difficult. Without any a priori knowledge of degradations, and based on physical light reflectance, we propose a selective metric-based clustering to extract textual information from real-world images. The proposed method uses several metrics to merge similar colors for an efficient text-driven segmentation in the RGB color space. However, color information alone is not sufficient to solve all natural scene issues; we therefore complement it with intensity and spatial information obtained using Log-Gabor filters, which enables the segmentation of characters into individual components and increases final recognition rates. Our selective metric-based clustering is thus integrated into a dynamic method suitable for text extraction and character segmentation. Quantitative results on a public database are presented to assess the efficiency and complementarity of the metrics, together with the importance of a dynamic system for natural scene text extraction. Finally, running time is detailed to show the usability of our method.
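To illustrate why the choice of metric matters when merging colors in RGB, here is a minimal sketch and not the method of the paper itself. The abstract does not name its metrics, so the two used here (Euclidean distance and an angle-based distance) are assumptions chosen for illustration, as are the function names, the simple k-means style loop, and the farthest-point initialization.

```python
import numpy as np

def euclidean_distance(pixels, centers):
    """Pairwise Euclidean distance in RGB, shape (N, K)."""
    diff = pixels[:, None, :] - centers[None, :, :]
    return np.linalg.norm(diff, axis=2)

def angle_distance(pixels, centers, eps=1e-12):
    """Pairwise (1 - cosine) distance, shape (N, K).

    Depends only on the direction of the RGB vector, so it is
    largely insensitive to intensity changes such as shading.
    """
    p = pixels / (np.linalg.norm(pixels, axis=1, keepdims=True) + eps)
    c = centers / (np.linalg.norm(centers, axis=1, keepdims=True) + eps)
    return 1.0 - p @ c.T

def cluster_colors(pixels, k, metric, iters=20):
    """Hypothetical k-means style clustering with a pluggable metric (iters >= 1)."""
    pixels = np.asarray(pixels, dtype=float)
    # Deterministic farthest-point initialization under the chosen metric.
    centers = [pixels[0]]
    while len(centers) < k:
        d = metric(pixels, np.array(centers)).min(axis=1)
        centers.append(pixels[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(iters):
        # Assign each pixel to its nearest center.
        labels = metric(pixels, centers).argmin(axis=1)
        new_centers = centers.copy()
        for j in range(k):
            members = pixels[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Bright red, dark red (same hue, shaded), dark blue.
pixels = np.array([[250, 10, 10], [50, 2, 2], [2, 2, 50]])
labels_euclid, _ = cluster_colors(pixels, 2, euclidean_distance)
labels_angle, _ = cluster_colors(pixels, 2, angle_distance)
print(labels_euclid, labels_angle)
```

On these three pixels the Euclidean metric groups the two dark pixels together, while the angle-based metric groups the two reds regardless of brightness. Differences of this kind are one plausible reason to select among several metrics instead of relying on a single one.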
