Automatic text recognition for video indexing

Efficient indexing and retrieval of digital video is an important aspect of video databases. One powerful index for retrieval is the text appearing in them. It enables content- based browsing. We present our methods for automatic segmentation and recognition of text in digital videos. The algorithms we propose make use of typical characteristics of text in videos in order to enable and enhance segmentation and recognition performance. Especially the inter-frame dependencies of the characters provide new possibilities for their refinement. Then, a straightforward indexing and retrieval scheme is introduced. It is used in the experiments to demonstrate that the proposed text segmentation and text recognition algorithms are suitable for indexing and retrieval of relevant video scenes in and from a video data base. Our experimental results are very encouraging and suggest that these algorithms can be used in video retrieval applications as well as to recognize higher semantics in video.

[1]  Peter E. Hart,et al.  Nearest neighbor pattern classification , 1967, IEEE Trans. Inf. Theory.

[2]  Theodosios Pavlidis,et al.  Picture Segmentation by a Tree Traversal Algorithm , 1976, JACM.

[3]  Michael McGill,et al.  Introduction to Modern Information Retrieval , 1983 .

[4]  Niklaus Wirth,et al.  Algorithms & data structures , 1985 .

[5]  Niklaus Wirth,et al.  Algorithms and Data Structures , 1989, Lecture Notes in Computer Science.

[6]  Ching Y. Suen,et al.  Historical review of OCR research and development , 1992, Proc. IEEE.

[7]  Yihong Gong,et al.  Automatic parsing of news video , 1994, 1994 Proceedings of IEEE International Conference on Multimedia Computing and Systems.

[8]  Marc Davis,et al.  Media streams: representing video for retrieval and repurposing , 1994, MULTIMEDIA '94.

[9]  Shigeru Akamatsu,et al.  Recognizing Characters in Scene Images , 1994, IEEE Trans. Pattern Anal. Mach. Intell..

[10]  Stephen W. Smoliar,et al.  Video parsing, retrieval and browsing: an integrated and content-based solution , 1997, MULTIMEDIA '95.

[11]  David L. Tennenhouse,et al.  ViewStation Applications: Implications for Network Traffic , 1995, IEEE J. Sel. Areas Commun..

[12]  HongJiang Zhang,et al.  Automatic parsing of TV soccer programs , 1995, Proceedings of the International Conference on Multimedia Computing and Systems.

[13]  Rune Hjelsvold,et al.  Integrated video archive tools , 1995, MULTIMEDIA '95.

[14]  M. Smith,et al.  Video Skimming for Quick Browsing based on Audio and Image Characterization , 1995 .

[15]  Wolfgang Effelsberg,et al.  Abstracting Digital Movies Automatically , 1996, J. Vis. Commun. Image Represent..

[16]  Rainer Lienhart,et al.  Automatic text recognition in digital videos , 1995, Electronic Imaging.

[17]  Boon-Lock Yeo,et al.  Visual content highlighting via automatic extraction of embedded captions on MPEG compressed video , 1996, Electronic Imaging.