TRECVID considered structured evaluation of automated video summarization for the first time in 2007, using BBC rushes video. That year, we conducted user evaluations following the published TRECVID summary assessment procedure to rate three methods for producing summaries: cluster (a clustering method), 25x (sampling every 25th frame), and pz (emphasizing pans and zooms). Data from 4 human assessors show significant differences among the cluster, pz, and 25x approaches. The best coverage (text inclusion performance) is obtained by 25x, but at a cost: 25x takes the most time to evaluate and is judged the most redundant. Method pz was easier to use than cluster and rated best on redundancy. A question raised after the TRECVID workshop was whether simple speed-ups would still work at 50x or 100x. This led to a study with 15 human assessors comparing pzA (pz with better audio), 25x, 50x, and 100x summaries (the latter three also with an unsynchronized, more comprehensive audio track). 100x gives the fastest time on task but poor usability and performance. pzA gives the best usability measures but poor time on task and performance. As before, 25x does well on performance; 50x does just as well, with much less time on task and better ease-of-use and redundancy scores. Based on these results, 50x with its audio skimming is recommended as the best way to summarize video rushes materials.
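The simple speed-up summaries (25x, 50x, 100x) amount to keeping every Nth frame of the source video. The following is a minimal sketch of that sampling only, not the implementation used in the study: the function names, the choice to start at frame 0, and the 25 frames-per-second playback rate are all assumptions made for illustration.

```python
def speed_up_indices(total_frames: int, factor: int) -> list[int]:
    """Return the indices of the frames kept by an N-times speed-up.

    Assumption: sampling starts at frame 0 and keeps every `factor`-th frame.
    """
    if factor < 1:
        raise ValueError("factor must be a positive integer")
    return list(range(0, total_frames, factor))


def summary_duration_seconds(total_frames: int, factor: int, fps: float = 25.0) -> float:
    """Playback time of the sampled summary.

    Assumption: the summary plays back at `fps` frames per second
    (25.0 is a hypothetical default, not a value taken from the study).
    """
    return len(speed_up_indices(total_frames, factor)) / fps


if __name__ == "__main__":
    # Hypothetical 30-minute source video at 25 fps (45,000 frames).
    source_frames = 30 * 60 * 25
    for factor in (25, 50, 100):
        seconds = summary_duration_seconds(source_frames, factor)
        print(f"{factor}x summary: {seconds:.0f} s")
```

Under these assumptions, doubling the speed-up factor halves the summary's playback time, which is the trade-off against coverage and usability that the 25x, 50x, and 100x comparison examines.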