Video OCR: indexing digital news libraries by recognition of superimposed captions

Abstract. Automatically extracting and recognizing news captions and annotations can greatly help in locating topics of interest in digital news video libraries. To this end, we present a technique called Video OCR (Optical Character Reader), which detects, extracts, and reads text areas in digital video data. In this paper, we describe the problems involved, explain how Video OCR operates, and suggest applications in digital news archives. Character recognition in video faces two main problems: low-resolution characters and extremely complex backgrounds. To address them, we apply an interpolation filter, multi-frame integration, and character extraction filters. Characters are segmented by a recognition-based method in which intermediate character recognition results are used to improve the segmentation. We also describe a method for locating text areas using text-like properties and a language-based postprocessing technique that increases word recognition rates. The overall recognition results are satisfactory for use in news indexing. Performing Video OCR on news video and combining its results with other video understanding techniques will improve the overall understanding of news video content.
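As a rough illustration of two of the steps named in the abstract, the sketch below shows one possible form of multi-frame integration and of language-based postprocessing. It is not the authors' implementation. It assumes that captions are bright and stationary while the background changes between frames, so a per-pixel minimum over frames keeps the caption and darkens the background; and it assumes postprocessing can be approximated by replacing each recognized word with the closest vocabulary word by edit distance. The names `integrate_frames`, `edit_distance`, `correct_word`, and the `max_distance` threshold are hypothetical.

```python
from typing import List, Sequence

# A grayscale frame as rows of 0-255 intensities (assumed representation).
Frame = List[List[int]]


def integrate_frames(frames: Sequence[Frame]) -> Frame:
    """Combine frames by taking the per-pixel minimum.

    Assumption: caption pixels stay bright in every frame, while
    background pixels are dark in at least one frame.
    """
    if not frames:
        raise ValueError("need at least one frame")
    height, width = len(frames[0]), len(frames[0][0])
    return [
        [min(frame[y][x] for frame in frames) for x in range(width)]
        for y in range(height)
    ]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def correct_word(word: str, vocabulary: Sequence[str],
                 max_distance: int = 2) -> str:
    """Return the nearest vocabulary word, or the input if none is close."""
    if not vocabulary:
        return word
    best = min(vocabulary, key=lambda v: edit_distance(word, v))
    return best if edit_distance(word, best) <= max_distance else word
```

For example, `correct_word("PRES1DENT", ["PRESIDENT", "RESIDENT"])` maps a typical digit-for-letter OCR confusion back to a vocabulary word. A real system would likely need a much larger vocabulary and a confusion-aware cost model; the uniform edit costs here are only a simplification.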
