Text extraction in MPEG compressed video for content-based indexing

Video text extraction is a core technique for multimedia applications such as news-on-demand (NOD) and digital libraries, and it has been an active research topic. In this paper, we propose an efficient method for extracting text from MPEG compressed video for content-based indexing. The proposed method exploits 2-level DCT coefficients and macroblock type information in MPEG compressed video, and it is organized into three stages to increase overall performance: text frame detection, text region extraction, and character extraction. The main advantage of the proposed method is that it avoids the overhead of decompressing video into individual frames in the pixel domain. We evaluated this method using various types of news video data.
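
The abstract does not give the features or thresholds used at each stage, so the following is only a minimal sketch of how compressed-domain text frame detection might look, not the method proposed in the paper. It assumes the 8x8 DCT coefficient blocks have already been read from the bitstream, and it swaps in a simple stand-in feature: the sum of absolute AC coefficients as a texture measure. The names `ac_energy`, `candidate_mask`, and `is_text_frame`, and both threshold parameters, are hypothetical.

```python
from typing import List, Sequence

# One 8x8 block of DCT coefficients, indexed block[v][u]; block[0][0] is the DC term.
Block = Sequence[Sequence[float]]


def ac_energy(block: Block) -> float:
    """Sum of absolute AC coefficients (every coefficient except the DC term)."""
    total = 0.0
    for v, row in enumerate(block):
        for u, coeff in enumerate(row):
            if u == 0 and v == 0:
                continue  # skip DC: it carries average brightness, not texture
            total += abs(coeff)
    return total


def candidate_mask(blocks: Sequence[Sequence[Block]],
                   energy_threshold: float) -> List[List[bool]]:
    """Flag blocks whose AC energy reaches the (assumed) threshold."""
    return [[ac_energy(b) >= energy_threshold for b in row] for row in blocks]


def is_text_frame(mask: Sequence[Sequence[bool]], min_fraction: float) -> bool:
    """Treat a frame as a text frame if enough of its blocks are candidates."""
    flat = [flag for row in mask for flag in row]
    if not flat:
        return False
    return sum(flat) / len(flat) >= min_fraction


# Usage: a 2x2 grid of blocks with one high-texture block.
flat_block = [[0.0] * 8 for _ in range(8)]
flat_block[0][0] = 500.0            # DC only, no texture
textured_block = [[0.0] * 8 for _ in range(8)]
textured_block[0][0] = 500.0
textured_block[0][1] = 40.0         # horizontal frequency component
textured_block[1][0] = -30.0        # vertical frequency component

grid = [[flat_block, textured_block], [flat_block, flat_block]]
mask = candidate_mask(grid, energy_threshold=50.0)
print(mask)
print(is_text_frame(mask, min_fraction=0.25))
```

Because the sketch reads only coefficients, it never reconstructs pixels, which is the property the abstract highlights. A real detector would also need the macroblock type information and the later region and character extraction stages, none of which are modeled here.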
