Caption text recognition in video frames by MAP matching

This paper describes an approach to detecting caption text in video frames. Text recognition in video has many applications, but problems remain, such as insufficient resolution and the complexity of layouts and backgrounds. This study attempts to solve these problems with a segmentation-free approach called the MAP matching method. Besides extending the method to grayscale images, it discusses a strategy for handling character size variation using Gaussian filtering and multi-sized reference patterns, as well as a method for detecting frames that contain caption text. Results show that the proposed matching method can detect characters of unknown size in caption text. Although over-detection is not negligible, verifying the positions of detected characters can identify the location of keywords with practical precision. It is also shown that frames containing caption text are detected with nearly 98% accuracy.
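
The idea of handling unknown character size with Gaussian filtering and multi-sized reference patterns can be illustrated with a minimal sketch. This is not the MAP matching method itself, whose scoring the abstract does not define: normalized cross-correlation is swapped in as a stand-in score, and every name here (`blur`, `resize_nearest`, `ncc`, `find_best_match`), the nearest-neighbor resizing, and the default `sigma` are assumptions made for illustration only.

```python
import math


def gaussian_kernel(sigma, radius):
    """Return a normalized 1-D Gaussian kernel of length 2*radius + 1."""
    weights = [math.exp(-(i * i) / (2.0 * sigma * sigma))
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def blur(image, sigma, radius=3):
    """Separable Gaussian blur; pixels outside the image count as zero."""
    kernel = gaussian_kernel(sigma, radius)
    height, width = len(image), len(image[0])
    # Horizontal pass over every row.
    rows = [[sum(kernel[j + radius] * image[y][x + j]
                 for j in range(-radius, radius + 1) if 0 <= x + j < width)
             for x in range(width)] for y in range(height)]
    # Vertical pass over the horizontally smoothed rows.
    return [[sum(kernel[i + radius] * rows[y + i][x]
                 for i in range(-radius, radius + 1) if 0 <= y + i < height)
             for x in range(width)] for y in range(height)]


def resize_nearest(pattern, size):
    """Scale a reference pattern to size x size by nearest-neighbor sampling."""
    src_h, src_w = len(pattern), len(pattern[0])
    return [[pattern[y * src_h // size][x * src_w // size]
             for x in range(size)] for y in range(size)]


def ncc(image, pattern, top, left):
    """Normalized cross-correlation (stand-in score, not the MAP score)."""
    size = len(pattern)
    window = [image[top + y][left + x]
              for y in range(size) for x in range(size)]
    ref = [v for row in pattern for v in row]
    mean_w = sum(window) / len(window)
    mean_r = sum(ref) / len(ref)
    num = sum((w - mean_w) * (r - mean_r) for w, r in zip(window, ref))
    den = math.sqrt(sum((w - mean_w) ** 2 for w in window)
                    * sum((r - mean_r) ** 2 for r in ref))
    # Flat windows carry no shape information; score them as zero.
    return num / den if den > 0 else 0.0


def find_best_match(image, base_pattern, sizes, sigma=1.0):
    """Try each reference size at every position; return (score, size, top, left)."""
    smoothed = blur(image, sigma)
    best = (-1.0, None, None, None)
    for size in sizes:
        ref = blur(resize_nearest(base_pattern, size), sigma)
        for top in range(len(image) - size + 1):
            for left in range(len(image[0]) - size + 1):
                score = ncc(smoothed, ref, top, left)
                if score > best[0]:
                    best = (score, size, top, left)
    return best
```

Calling `find_best_match(frame, pattern, sizes=[3, 6])` on a grayscale frame given as a list of rows returns the best-scoring size and position. Smoothing both the frame and the references is meant to make the score less sensitive to small size and position mismatches. A real detector would keep every position above a threshold rather than a single maximum, which is where over-detection of the kind noted above would have to be handled.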
