Caption Localization and Detection for News Videos Using Frequency Analysis and Wavelet Features

In this paper, we propose an algorithm to detect captions in news videos. The proposed method detects only captions and excludes other miscellaneous types of text. Because a caption persists across many consecutive frames, the algorithm exploits this temporal redundancy to reduce the number of frames that must be processed. The frame in which a caption first appears (the caption beginning frame) is detected first, and a caption candidate strip is then defined within that frame. Next, the difference of the caption candidate strip between consecutive frames is computed, and this difference information is transformed to the frequency domain by the discrete cosine transform. Frequency analysis is used to define the caption candidate region, and twelve wavelet features extracted from that region serve as the input to a classifier that detects the text blocks. Experimental results show that the proposed approach detects captions in news video quickly and robustly.
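The strip-differencing and frequency-transform step can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the strip geometry, the form of the difference, or the frequency criterion, so the absolute difference, the orthonormal 2D DCT-II, and the `high_freq_ratio` measure with its `cutoff` parameter are all assumptions introduced here for illustration. Strips are plain nested lists of grey levels.

```python
import math


def dct_1d(x):
    """Orthonormal DCT-II of a 1D sequence."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out


def dct_2d(block):
    """Separable 2D DCT-II: transform every row, then every column."""
    rows = [dct_1d(r) for r in block]
    cols = [dct_1d(list(c)) for c in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def strip_difference(prev_strip, curr_strip):
    """Pixel-wise absolute difference of the same strip in two frames
    (assumed form of the 'difference information')."""
    return [[abs(c - p) for p, c in zip(prev_row, curr_row)]
            for prev_row, curr_row in zip(prev_strip, curr_strip)]


def high_freq_ratio(coeffs, cutoff=1):
    """Fraction of DCT energy at index sum (row + col) >= cutoff.
    Hypothetical measure, not taken from the paper."""
    total = sum(v * v for row in coeffs for v in row)
    if total == 0:
        return 0.0
    high = sum(v * v
               for u, row in enumerate(coeffs)
               for w, v in enumerate(row)
               if u + w >= cutoff)
    return high / total


# Toy 2x4 strips: a textured change appears between the two frames.
prev_strip = [[20, 20, 20, 20], [20, 20, 20, 20]]
curr_strip = [[20, 30, 20, 30], [20, 30, 20, 30]]

diff = strip_difference(prev_strip, curr_strip)
ratio = high_freq_ratio(dct_2d(diff))
```

In this sketch, a uniform brightness change between frames puts all of its energy in the DC coefficient, while the stroke-like structure of newly appearing text spreads energy into higher frequencies. A threshold on such a ratio would be one plausible way to flag a candidate region, although the criterion actually used in the paper is not stated in the abstract.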
