Accurate video text detection through classification of low and high contrast images

Detection of both scene text and graphic text in video images is gaining popularity in the area of information retrieval for efficient indexing and understanding the video. In this paper, we explore a new idea of classifying low contrast and high contrast video images in order to detect accurate boundary of the text lines in video images. In this work, high contrast refers to sharpness while low contrast refers to dim intensity values in the video images. The method introduces heuristic rules based on combination of filters and edge analysis for the classification purpose. The heuristic rules are derived based on the fact that the number of Sobel edge components is more than the number of Canny edge components in the case of high contrast video images, and vice versa for low contrast video images. In order to demonstrate the use of this classification on video text detection, we implement a method based on Sobel edges and texture features for detecting text in video images. Experiments are conducted using video images containing both graphic text and scene text with different fonts, sizes, languages, backgrounds. The results show that the proposed method outperforms existing methods in terms of detection rate, false alarm rate, misdetection rate and inaccurate boundary rate.

[1]  Hao Yan,et al.  Automatic Text Detection In Video Frames Based on Bootstrap Artificial Neural Network and CED , 2003, WSCG.

[2]  Rangachar Kasturi,et al.  Locating uniform-colored text in video frames , 2000, Proceedings 15th International Conference on Pattern Recognition. ICPR-2000.

[3]  Xinbo Gao,et al.  A spatial-temporal approach for video caption detection and recognition , 2002, IEEE Trans. Neural Networks.

[4]  Rainer Lienhart,et al.  On the segmentation of text in videos , 2000, 2000 IEEE International Conference on Multimedia and Expo. ICME2000. Proceedings. Latest Advances in the Fast Changing World of Multimedia (Cat. No.00TH8532).

[5]  Michael R. Lyu,et al.  A comprehensive method for multilingual video text detection, localization, and extraction , 2005, IEEE Transactions on Circuits and Systems for Video Technology.

[6]  Anil K. Jain,et al.  Locating text in complex color images , 1995, Proceedings of 3rd International Conference on Document Analysis and Recognition.

[7]  Rainer Lienhart,et al.  Localizing and segmenting text in images and videos , 2002, IEEE Trans. Circuits Syst. Video Technol..

[8]  Hang Joon Kim,et al.  Automatic text detection and removal in video sequences , 2003, Pattern Recognit. Lett..

[9]  N. Otsu A threshold selection method from gray level histograms , 1979 .

[10]  Paul Wintz,et al.  Digital image processing (2nd ed.) , 1987 .

[11]  Korris Fu-Lai Chung,et al.  Hybrid Chinese/English text detection in images and video frames , 2002, Object recognition supported by user interaction for service robots.

[12]  Chein-I Chang,et al.  Thresholding Video Images for Text Detection , 2002, ICPR.

[13]  A.A. Alatan,et al.  Automatic Video Text Localization and Recognition , 2006, 2006 IEEE 14th Signal Processing and Communications Applications.

[14]  Anil K. Jain,et al.  Text information extraction in images and video: a survey , 2004, Pattern Recognit..

[15]  Jean-Marc Odobez,et al.  Sequential Monte Carlo video text segmentation , 2003, Proceedings 2003 International Conference on Image Processing (Cat. No.03CH37429).

[16]  David S. Doermann,et al.  Automatic text detection and tracking in digital video , 2000, IEEE Trans. Image Process..

[17]  Edward K. Wong,et al.  A new robust algorithm for video text extraction , 2003, Pattern Recognit..

[18]  Chunheng Wang,et al.  Text detection in images based on unsupervised classification of edge-based features , 2005, Eighth International Conference on Document Analysis and Recognition (ICDAR'05).

[19]  Palaiahnakote Shivakumara,et al.  An Efficient Edge Based Technique for Text Detection in Video Frames , 2008, 2008 The Eighth IAPR International Workshop on Document Analysis Systems.

[20]  Anil K. Jain,et al.  Automatic text location in images and video frames , 1998, Proceedings. Fourteenth International Conference on Pattern Recognition (Cat. No.98EX170).

[21]  Wen Gao,et al.  A robust text detection algorithm in images and video frames , 2003, Fourth International Conference on Information, Communications and Signal Processing, 2003 and the Fourth Pacific Rim Conference on Multimedia. Proceedings of the 2003 Joint.

[22]  Anil K. Jain,et al.  Automatic caption localization in compressed video , 1999, Proceedings 1999 International Conference on Image Processing (Cat. 99CH36348).

[23]  Palaiahnakote Shivakumara,et al.  New Filter Based Unsupervised Rules for Boolean Blur Metric , 2007, 2007 International Conference on Computing: Theory and Applications (ICCTA'07).

[24]  Shih-Fu Chang,et al.  General and domain-specific techniques for detecting and recognizing superimposed text in video , 2002, Proceedings. International Conference on Image Processing.

[25]  Xian-Sheng Hua,et al.  An automatic performance evaluation protocol for video text detection algorithms , 2004, IEEE Transactions on Circuits and Systems for Video Technology.

[26]  Jean-Philippe Thiran,et al.  A localization/verification scheme for finding text in images and video frames based on contrast independent features and machine learning methods , 2004, Signal Process. Image Commun..

[27]  Michael R. Lyu,et al.  A new approach for video text detection , 2002, Proceedings. International Conference on Image Processing.

[28]  Xian-Sheng Hua,et al.  Automatic performance evaluation for video text detection , 2001, Proceedings of Sixth International Conference on Document Analysis and Recognition.

[29]  Wen Gao,et al.  Fast and robust text detection in images and video frames , 2005, Image Vis. Comput..

[30]  Jean-Marc Odobez,et al.  Text detection, recognition in images and video frames , 2004, Pattern Recognit..