Accurate text localization in images based on SVM output scores

In this paper, we propose a new approach for accurate text localization in images based on SVM (support vector machine) output scores. In general, the SVM output score obtained when verifying a text candidate measures how close that candidate is to text. Until now, most researchers have used this score only to decide whether a candidate region is text or not. We instead also use the output score to refine the initially localized text lines and to select the best localization result among the different pyramid levels, which yields more accurate text localization results. Our method has three modules: (1) text candidate detection based on edge-CCA (connected component analysis), (2) text candidate verification based on the classifier fusion of N-gray (normalized gray intensity) and CGV (constant gradient variance), and (3) text line refinement based on the SVM output score, color distribution and prior geometric knowledge. Experiments on a large news database demonstrate that our method performs well in terms of accuracy, robustness and efficiency.
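The idea of using a verification score for refinement rather than only for accept/reject decisions can be illustrated with a minimal sketch. It is not the paper's algorithm: the greedy edge-by-edge search, the function names `refine_box` and `select_best_level`, and the `score` callable (a stand-in for an SVM decision value computed on the cropped region) are all assumptions made for illustration, and the color-distribution and geometric cues of module (3) are omitted.

```python
from typing import Callable, Iterable, Tuple

Box = Tuple[int, int, int, int]   # (x0, y0, x1, y1) in original-image coordinates
ScoreFn = Callable[[Box], float]  # hypothetical stand-in for an SVM output score


def refine_box(box: Box, score: ScoreFn, step: int = 1,
               max_iter: int = 50) -> Tuple[Box, float]:
    """Greedily move one box edge at a time while the score keeps improving."""
    best, best_score = box, score(box)
    for _ in range(max_iter):
        improved = False
        for i in range(4):                  # each of the four edges
            for delta in (-step, step):     # try moving it in both directions
                cand = list(best)
                cand[i] += delta
                x0, y0, x1, y1 = cand
                if x0 < 0 or y0 < 0 or x1 <= x0 or y1 <= y0:
                    continue                # skip degenerate or out-of-range boxes
                s = score((x0, y0, x1, y1))
                if s > best_score:
                    best, best_score, improved = (x0, y0, x1, y1), s, True
        if not improved:
            break                           # local maximum of the score reached
    return best, best_score


def select_best_level(candidates: Iterable[Box], score: ScoreFn,
                      max_iter: int = 50) -> Tuple[Box, float]:
    """Refine one candidate box per pyramid level and keep the highest-scoring one."""
    refined = [refine_box(b, score, max_iter=max_iter) for b in candidates]
    return max(refined, key=lambda r: r[1])


if __name__ == "__main__":
    # Toy score: negative L1 distance to a made-up "true" text line.
    target = (10, 20, 110, 40)
    toy = lambda b: -float(sum(abs(a - t) for a, t in zip(b, target)))
    print(select_best_level([(0, 0, 30, 30), (12, 22, 115, 38)], toy))
```

In a real system the `score` callable would crop the region, extract features and return the classifier's signed output, and the candidate boxes from coarser pyramid levels would first be mapped back to the original image scale so that their scores are compared on the same region of the image.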
