Accurate Overlay Text Extraction for Digital Video Analysis

This report describes a system that detects and extracts overlay text in digital video. Unlike previous approaches, the system uses multiple hypothesis testing. Each region of interest (ROI) likely to contain overlay text is decomposed into several hypothetical binary images by color space partitioning. A grouping algorithm then groups the identified character blocks into text lines within each binary image. If the layout of the grouped text lines conforms to the verification rules, the bounding boxes of the grouped blocks are output as the detected text regions. Finally, motion verification is used to reduce false alarms. To achieve real-time speed, ROI localization uses compressed-domain features of MPEG video, including DCT coefficients and motion vectors. In tests on digital news videos, the proposed method achieved an average recall of 96.9% and a precision of 71.6%.
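The hypothesis-and-grouping steps described above can be illustrated with a minimal Python sketch. It is not the report's implementation. The report partitions a color space, whereas this sketch bins grayscale intensities as a simplification. The function names, the thresholds (`n_bins`, `max_gap`, `min_overlap`, `min_blocks`), and the single verification rule (a minimum number of blocks per line) are all assumptions made for illustration. Character blocks are assumed to be already extracted from each binary image, and ROI localization and motion verification are omitted.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height) of one character block


def partition_by_intensity(roi: List[List[int]], n_bins: int = 4) -> List[List[List[int]]]:
    """Split a grayscale ROI (rows of 0..255 values) into one binary image per bin.

    Assumption: intensity bins stand in for the report's color space partitioning.
    """
    step = 256 // n_bins
    return [
        [[1 if min(p // step, n_bins - 1) == b else 0 for p in row] for row in roi]
        for b in range(n_bins)
    ]


def group_into_lines(blocks: List[Box], max_gap: int = 10,
                     min_overlap: float = 0.5) -> List[List[Box]]:
    """Greedily chain blocks, left to right, into candidate text lines."""
    lines: List[List[Box]] = []
    for box in sorted(blocks):  # tuples sort by x first
        for line in lines:
            last = line[-1]
            gap = box[0] - (last[0] + last[2])           # horizontal distance
            top = max(box[1], last[1])
            bottom = min(box[1] + box[3], last[1] + last[3])
            overlap = (bottom - top) / min(box[3], last[3])  # vertical overlap ratio
            if gap <= max_gap and overlap >= min_overlap:
                line.append(box)
                break
        else:
            lines.append([box])  # no compatible line found: start a new one
    return lines


def bounding_box(line: List[Box]) -> Box:
    """Smallest box enclosing every block in the line."""
    x0 = min(b[0] for b in line)
    y0 = min(b[1] for b in line)
    x1 = max(b[0] + b[2] for b in line)
    y1 = max(b[1] + b[3] for b in line)
    return (x0, y0, x1 - x0, y1 - y0)


def detect_text_regions(blocks: List[Box], min_blocks: int = 3) -> List[Box]:
    """Keep lines that pass the (hypothetical) layout rule and return their boxes."""
    return [bounding_box(line) for line in group_into_lines(blocks)
            if len(line) >= min_blocks]


if __name__ == "__main__":
    masks = partition_by_intensity([[0, 70, 130, 255]])
    print(len(masks), masks[1])
    blocks = [(0, 0, 8, 10), (10, 1, 8, 10), (20, 0, 8, 10), (100, 50, 8, 10)]
    print(detect_text_regions(blocks))
```

In this sketch, an isolated block is discarded because it cannot form a line of at least `min_blocks` blocks, which mirrors the idea that only groupings with a plausible text layout are reported. In a full system the same grouping would run once per binary image, so each partition acts as a separate hypothesis.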
