In 2007, TRECVID for the first time conducted a structured evaluation of automated video summarization, using BBC rushes video. This paper discusses in detail our approaches for producing the summaries submitted to TRECVID, including the two baseline methods. The cluster method performed well in terms of coverage and adequately in terms of user satisfaction, but its summaries took longer to review. We conducted additional evaluations, using the same TRECVID assessment interface, to judge two further methods of summary generation: 25x (a simple speed-up by a factor of 25) and pz (emphasizing pans and zooms). Data from four human assessors show significant differences between the cluster, pz, and 25x approaches. The best coverage (text inclusion performance) is obtained by 25x, but its summaries took the most time to evaluate and were perceived as the most redundant. The pz method was easier to use than cluster and performed better on pan/zoom recall tasks, which leads to a discussion of how summaries can be improved given more knowledge of the anticipated users and tasks.
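As an illustration of the 25x idea only, the following minimal sketch assumes the speed-up is realized by keeping every 25th frame of the source video; the text above does not state how the speed-up was implemented, and the function names and the 25 fps default are hypothetical.

```python
def speed_up_indices(num_frames: int, factor: int = 25) -> list[int]:
    """Indices of the frames kept when the video is sped up by `factor`.

    Assumption: the speed-up is done by uniform frame subsampling
    (keep frame 0, then every `factor`-th frame after it).
    """
    if factor < 1:
        raise ValueError("factor must be >= 1")
    return list(range(0, num_frames, factor))


def summary_duration(num_frames: int, fps: float = 25.0, factor: int = 25) -> float:
    """Playback time in seconds of the subsampled summary at `fps`."""
    return len(speed_up_indices(num_frames, factor)) / fps
```

Under these assumptions, a 1500-frame source played at 25 fps (one minute of video) yields a 60-frame summary lasting 2.4 seconds.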