Summarization scheme based on near-duplicate analysis

This paper presents our approach to selecting relevant sequences from raw videos in order to generate summaries for the TRECVID 2008 BBC Rushes Task. Our system is composed of two major steps. First, the system detects "semantic" shot boundaries and keeps only non-redundant shots. Then, the system estimates the average motion of each remaining shot, used as a measure of the amount of information it carries, to better distribute the duration of the summary among the remaining shots. The first step is based on fast near-duplicate retrieval using Locality Sensitive Hashing (LSH), which provides results in a few seconds (not counting the decoding and encoding processes). The TRECVID evaluation shows very promising results: out of 43 runs, we ranked 17th on the redundancy measure (RE) and 18th on object and event inclusion (IN). These balanced results (most of the best-ranked teams on the first criterion are among the last on the second) show that our method offers a good trade-off between false negatives (IN) and false positives (RE).
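
The two steps can be illustrated with a minimal Python sketch. It is not the implementation that was evaluated: the random-hyperplane hash family, the fixed-length shot descriptors, the strictly proportional allocation rule, and all function names (make_hyperplanes, lsh_key, drop_redundant_shots, allocate_durations) are assumptions introduced only for this example.

```python
import random


def make_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random hyperplanes (assumed hash family: sign of projection)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_key(descriptor, planes):
    """Hash a descriptor to a tuple of bits, one per hyperplane."""
    return tuple(
        sum(p * v for p, v in zip(plane, descriptor)) >= 0.0 for plane in planes
    )


def drop_redundant_shots(descriptors, planes):
    """Keep the index of the first shot seen in each hash bucket.

    Shots whose descriptors fall in an already occupied bucket are treated
    as near-duplicates and discarded (a simplification for illustration).
    """
    seen = set()
    kept = []
    for idx, descriptor in enumerate(descriptors):
        key = lsh_key(descriptor, planes)
        if key not in seen:
            seen.add(key)
            kept.append(idx)
    return kept


def allocate_durations(avg_motion, budget):
    """Split the summary budget among shots in proportion to average motion."""
    if not avg_motion:
        return []
    total = sum(avg_motion)
    if total <= 0:
        # No measurable motion anywhere: fall back to an equal split.
        return [budget / len(avg_motion)] * len(avg_motion)
    return [budget * m / total for m in avg_motion]


if __name__ == "__main__":
    planes = make_hyperplanes(dim=3, n_bits=8)
    shots = [[1.0, 0.0, 2.0], [1.0, 0.0, 2.0], [-1.0, 0.0, -2.0]]
    kept = drop_redundant_shots(shots, planes)
    print(kept)
    print(allocate_durations([1.0, 3.0], budget=40.0))
```

In this sketch a single hash table is used for brevity; practical LSH schemes usually combine several tables to reduce missed near-duplicates, and the budget would be the summary duration allowed by the task.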