A TV broadcast video stream consists of many kinds of programs, such as sitcoms, news, sports, commercials, and weather. In this paper, we propose a semantic image category, named Program Oriented Informative Images (POIM), to facilitate the segmentation, indexing, and retrieval of different programs. The underlying assumption is that most stations insert lead-in/lead-out video shots to explicitly introduce the current program and to indicate transitions between consecutive programs within a TV stream. Such shots often overlay text, graphics, and storytelling images to create an image sequence of POIM that serves as a visual representation of the current program. With advances in post-editing effects, POIM is becoming an effective indicator for structuring TV streams, and it is also a fairly common prop in program content production. We have made an initial attempt at a POIM recognizer that combines a set of global and local visual features with supervised and unsupervised learning, and we have carried out comparison experiments. A promising result, F1 = 90.2%, was achieved on a subset of the TRECVID 2005 video corpus. The recognition of POIM, together with other audiovisual features, can be used to further determine program boundaries.