Automatic performance evaluation for video text detection

We propose an objective, comprehensive and difficulty-independent performance evaluation protocol for video text detection algorithms. The protocol defines a positive set and a negative set of textbox-level indices, which measure detection quality in terms of both the location accuracy and the fragmentation of the detected textboxes. We assign a detection difficulty (DD) level to each ground truth textbox; the performance indices are then normalized with respect to the DD level and are therefore independent of the difficulty of the ground truth. We also assign a detection importance (DI) level to each ground truth textbox. The overall detection rate is the DI-weighted average of the detection qualities of all ground truth textboxes, so that it reflects actual performance more accurately than an unweighted average. We applied the automatic evaluation scheme to a text detection approach to select the parameters that yield the best detection results.
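The DI-weighted average described above can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the function name `overall_detection_rate`, the assumption that each detection quality lies in [0, 1], and the use of positive numeric DI weights are all hypothetical choices made for illustration; the DD normalization step is not shown because its form is not specified here.

```python
from typing import Sequence


def overall_detection_rate(
    qualities: Sequence[float],
    importances: Sequence[float],
) -> float:
    """Return the importance-weighted mean of per-textbox detection qualities.

    qualities   -- one detection quality per ground truth textbox
                   (assumed to lie in [0, 1]; hypothetical scale)
    importances -- one DI weight per ground truth textbox
                   (assumed to be non-negative; hypothetical scale)
    """
    if len(qualities) != len(importances):
        raise ValueError("qualities and importances must have equal length")
    total_weight = sum(importances)
    if total_weight <= 0:
        raise ValueError("the DI weights must sum to a positive value")
    weighted_sum = sum(q * w for q, w in zip(qualities, importances))
    return weighted_sum / total_weight


# Three ground truth textboxes: a prominent caption that was fully detected,
# and two less important textboxes that were half detected and missed.
rate = overall_detection_rate([1.0, 0.5, 0.0], [3.0, 1.0, 1.0])
print(f"DI-weighted detection rate: {rate:.2f}")
```

With these made-up inputs the unweighted mean would be 0.5, while the weighted rate is higher because the textbox with the largest DI weight was detected completely.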
