Automatic caption localization in compressed video

We present a method to automatically locate captions in MPEG video. Caption text regions are segmented from the background using their distinguishing texture characteristics. The method first locates candidate text regions directly in the DCT-compressed domain, and then reconstructs only those candidate regions for further refinement in the spatial domain. As a result, only a small amount of decoding is required. The proposed algorithm achieves a false reject rate of about 4.0% and a false positive rate below 5.7% on a variety of MPEG-compressed videos containing more than 42,000 frames in total.
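The first stage, flagging candidate text blocks from DCT coefficients, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say which coefficients or thresholds are used, so the texture feature here (the summed magnitude of the horizontal AC coefficients of each 8x8 block), the threshold value, and the names `horizontal_ac_energy` and `candidate_blocks` are all assumptions made for illustration. The sketch also computes the DCT from pixel values to stay self-contained, whereas a compressed-domain method would take the coefficients from the bitstream.

```python
import math

BLOCK = 8  # MPEG codes luminance in 8x8 DCT blocks


def dct_basis(n=BLOCK):
    """Orthonormal DCT-II basis: basis[k][x] for frequency k, position x."""
    basis = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        basis.append([scale * math.cos(math.pi * (2 * x + 1) * k / (2 * n))
                      for x in range(n)])
    return basis


def horizontal_ac_energy(block, basis):
    """Sum of |C[0][v]| for v = 1..7 of one 8x8 block (assumed texture feature).

    C[0][v] has zero vertical frequency, so it responds to intensity
    changes along a row, such as the strokes of horizontally laid-out text.
    """
    n = len(basis)
    # basis[0][i] is constant, so C[0][v] reduces to a DCT of the column sums.
    col_sums = [sum(block[i][j] for i in range(n)) for j in range(n)]
    energy = 0.0
    for v in range(1, n):
        coeff = basis[0][0] * sum(col_sums[j] * basis[v][j] for j in range(n))
        energy += abs(coeff)
    return energy


def candidate_blocks(frame, threshold):
    """Return a grid of booleans, one per 8x8 block, True where energy > threshold.

    `frame` is a list of rows of luminance values; partial blocks at the
    right and bottom edges are ignored. `threshold` is a hypothetical
    tuning parameter, not a value from the paper.
    """
    basis = dct_basis()
    rows, cols = len(frame) // BLOCK, len(frame[0]) // BLOCK
    mask = []
    for by in range(rows):
        mask_row = []
        for bx in range(cols):
            block = [frame[by * BLOCK + i][bx * BLOCK:(bx + 1) * BLOCK]
                     for i in range(BLOCK)]
            mask_row.append(horizontal_ac_energy(block, basis) > threshold)
        mask.append(mask_row)
    return mask


if __name__ == "__main__":
    # Flat gray frame with a band of vertical stripes standing in for text.
    frame = [[128.0] * 32 for _ in range(32)]
    for y in range(16, 24):
        for x in range(8, 24):
            frame[y][x] = 255.0 if x % 2 else 0.0
    for row in candidate_blocks(frame, threshold=100.0):
        print("".join("#" if flag else "." for flag in row))
```

Only the blocks flagged in this way would then be decoded to pixels for the spatial-domain refinement, which is where the saving in decoding effort comes from.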
