A novel video text extraction approach based on Log-Gabor filters

Video text provides important semantic clues about video content, and text extraction is a crucial stage in analyzing it. Most papers perform video text extraction using stroke and intensity features, which are sensitive to the video background, so character extraction is difficult when frames have complex backgrounds. The structure of text characters, however, is carried by their phase information, which is insensitive to the background. We therefore use a Log-Gabor filter to generate a phase map (the phase angle image) and propose a novel text extraction algorithm based on it. First, for the text row in a single frame, we compute the phase map. Second, we perform k-means clustering on the phase map and select one clustering result as the text character image. Third, from 30 consecutive frames that contain the same text characters, we select four text character images and combine them into one binary character image. Finally, we apply dam point labeling and inward filling [1] to remove residual noise and obtain the final binary character image. Experimental results show that this approach is robust and can be effectively applied to text extraction in video.
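The first three steps above can be sketched in Python with NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the filter parameters, the clustering features, the cluster selection rule, or the combination rule. The function names, the single filter scale and orientation, the parameter values (`wavelength`, `sigma_on_f`, `theta`, `sigma_theta`), the (cos, sin) embedding of the phase angle, and the majority vote used to combine masks are all hypothetical choices made for the example. Dam point labeling and inward filling are not shown.

```python
import numpy as np

def log_gabor_phase(img, wavelength=8.0, sigma_on_f=0.55, theta=0.0, sigma_theta=0.6):
    """Phase angle (radians) of one log-Gabor filter response; parameters are assumed values."""
    rows, cols = img.shape
    fy = np.fft.fftfreq(rows)[:, None]   # vertical frequencies, shape (rows, 1)
    fx = np.fft.fftfreq(cols)[None, :]   # horizontal frequencies, shape (1, cols)
    radius = np.sqrt(fx ** 2 + fy ** 2)
    radius[0, 0] = 1.0                   # avoid log(0) at the DC term
    f0 = 1.0 / wavelength                # centre frequency of the filter
    radial = np.exp(-(np.log(radius / f0) ** 2) / (2 * np.log(sigma_on_f) ** 2))
    radial[0, 0] = 0.0                   # a log-Gabor filter has no DC component
    angle = np.arctan2(fy, fx)
    # Angular distance to the filter orientation, wrapped to [-pi, pi].
    d_theta = np.arctan2(np.sin(angle - theta), np.cos(angle - theta))
    angular = np.exp(-(d_theta ** 2) / (2 * sigma_theta ** 2))
    # The one-sided filter gives a complex response: real = even part, imaginary = odd part.
    response = np.fft.ifft2(np.fft.fft2(img) * radial * angular)
    return np.angle(response)

def kmeans(features, k=3, iters=20, seed=0):
    """Plain Lloyd's k-means on an (N, D) feature array."""
    rng = np.random.default_rng(seed)
    centers = features[rng.choice(len(features), size=k, replace=False)]
    for _ in range(iters):
        dists = ((features[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):      # leave an empty cluster's centre unchanged
                centers[j] = features[labels == j].mean(axis=0)
    return labels, centers

def phase_cluster_masks(img, k=3):
    """Cluster the phase map and return one boolean mask per cluster."""
    phase = log_gabor_phase(np.asarray(img, dtype=float))
    # Embed the angle on the unit circle so that -pi and +pi are treated as neighbours.
    feats = np.stack([np.cos(phase).ravel(), np.sin(phase).ravel()], axis=1)
    labels, _ = kmeans(feats, k=k)
    return [(labels == j).reshape(phase.shape) for j in range(k)]

def combine_masks(masks, min_votes=None):
    """Combine binary character images by per-pixel voting (assumed rule: strict majority)."""
    stack = np.stack(masks).astype(int)
    if min_votes is None:
        min_votes = len(masks) // 2 + 1
    return stack.sum(axis=0) >= min_votes
```

In use, `phase_cluster_masks` would be called on a cropped grayscale text row, one of the returned masks would be chosen as the text character image by whatever selection rule is adopted, and the four masks selected across the 30 frames would be passed to `combine_masks`.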

[1]  Frank Lebourgeois, et al. Serialized unsupervised classifier for adaptative color image segmentation: application to digitized ancient manuscripts, 2004, Proceedings of the 17th International Conference on Pattern Recognition (ICPR 2004).

[2]  Datong Chen, et al. Text enhancement with asymmetric filter for video OCR, 2001, Proceedings 11th International Conference on Image Analysis and Processing.

[3]  Hubert Emptoz, et al. Serialized unsupervised classifier for adaptative color image segmentation: application to digitized ancient manuscripts, 2004, ICPR 2004.

[4]  Yan Chen, et al. Comparison of some thresholding algorithms for text/background segmentation in difficult document images, 2003, Proceedings of the Seventh International Conference on Document Analysis and Recognition.

[5]  N. Otsu. A threshold selection method from gray level histograms, 1979.

[6]  Stephen M. Watt, et al. Hybrid Mathematical Symbol Recognition Using Support Vector Machines, 2007.

[7]  Songfeng Lu, et al. A density-based approach for text extraction in images, 2008, 19th International Conference on Pattern Recognition.

[8]  Wen Gao, et al. Multi-polarity text segmentation using graph theory, 2008, 15th IEEE International Conference on Image Processing.

[9]  Wen Gao, et al. A hybrid text segmentation approach, 2009, IEEE International Conference on Multimedia and Expo.

[10]  Xinbo Gao, et al. A spatial-temporal approach for video caption detection and recognition, 2002, IEEE Trans. Neural Networks.

[11]  Stéphane Mallat, et al. A Theory for Multiresolution Signal Decomposition: The Wavelet Representation, 1989, IEEE Trans. Pattern Anal. Mach. Intell.

[12]  D. J. Field, et al. Relations between the statistics of natural images and the response properties of cortical cells, 1987, Journal of the Optical Society of America A, Optics and Image Science.

[13]  Michael R. Lyu, et al. A comprehensive method for multilingual video text detection, localization, and extraction, 2005, IEEE Transactions on Circuits and Systems for Video Technology.

[14]  John S. Boreczky, et al. A hidden Markov model framework for video segmentation using audio and image features, 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP '98).

[15]  Bernard Gosselin, et al. Color text extraction from camera-based images: the impact of the choice of the clustering distance, 2005, Eighth International Conference on Document Analysis and Recognition (ICDAR'05).

[16]  Changsheng Xu, et al. Video Clock Time Recognition Based on Temporal Periodic Pattern Change of the Digit Characters, 2006, IEEE International Conference on Acoustics, Speech and Signal Processing Proceedings.

[17]  Changsheng Xu, et al. Reliable Video Clock Time Recognition, 2006, 18th International Conference on Pattern Recognition (ICPR'06).