Support vector machine-based text detection in digital video

Textual data within video frames are very useful for describing the contents of those frames, since they enable both keyword and free-text searching. In this paper, we pose text location in digital video as a supervised texture classification problem and use a support vector machine (SVM) as the texture classifier. Unlike other text detection methods, we do not use any explicit texture feature extraction scheme; instead, the gray-level values of the raw pixels are fed directly to the classifier. This is based on the observation that an SVM can learn in a high-dimensional space and incorporates a feature extraction scheme in its own architecture. A comparison with a neural network-based text detection method illustrates the effectiveness of the proposed method.
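
The idea of feeding raw gray-level values to an SVM can be sketched as follows. This is a minimal illustration and not the authors' implementation: the function names, the window size and step, and the degree-2 polynomial kernel are all assumptions made for the example, and the support vectors, coefficients, and bias are assumed to come from a training stage that is not shown.

```python
from typing import List, Sequence, Tuple

Frame = List[List[int]]  # gray-level frame: rows of pixel values in 0..255


def extract_windows(frame: Frame, size: int, step: int) -> List[Tuple[int, int, List[float]]]:
    """Slide a size x size window over the frame (hypothetical helper).

    Returns (row, col, vector) triples, where vector holds the raw gray
    levels of the window scaled to [0, 1]. No texture features are computed.
    """
    height, width = len(frame), len(frame[0])
    windows = []
    for r in range(0, height - size + 1, step):
        for c in range(0, width - size + 1, step):
            vec = [frame[r + i][c + j] / 255.0
                   for i in range(size) for j in range(size)]
            windows.append((r, c, vec))
    return windows


def poly_kernel(u: Sequence[float], v: Sequence[float], degree: int = 2) -> float:
    """Polynomial kernel (u . v + 1)^degree; the kernel choice is an assumption."""
    return (sum(a * b for a, b in zip(u, v)) + 1.0) ** degree


def svm_decision(x: Sequence[float],
                 support_vectors: Sequence[Sequence[float]],
                 coeffs: Sequence[float],
                 bias: float,
                 degree: int = 2) -> float:
    """Kernel SVM decision value f(x) = sum_i coeffs[i] * K(sv_i, x) + bias.

    coeffs[i] stands for alpha_i * y_i of an already trained SVM.
    """
    return sum(c * poly_kernel(sv, x, degree)
               for sv, c in zip(support_vectors, coeffs)) + bias


def detect_text(frame: Frame, size: int, step: int,
                support_vectors: Sequence[Sequence[float]],
                coeffs: Sequence[float], bias: float,
                degree: int = 2) -> List[Tuple[int, int]]:
    """Return the top-left corners of windows with a positive decision value."""
    return [(r, c)
            for r, c, vec in extract_windows(frame, size, step)
            if svm_decision(vec, support_vectors, coeffs, bias, degree) > 0]
```

In this sketch the kernel plays the role of the implicit feature extractor: the classifier only ever sees pixel vectors, and any texture-sensitive representation arises from the kernel-induced feature space rather than from a separate extraction step.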
