The Use and Utility of High-Level Semantic Features in Video Retrieval

This paper investigates the applicability of high-level semantic features for video retrieval using the benchmarked data from TRECVID 2003 and 2004, addressing the contributions of features like outdoor, face, and animal in retrieval, and if users can correctly decide on which features to apply for a given need. Pooled truth data gives evidence that some topics would benefit from features. A study with 12 subjects found that people often disagree on the relevance of a feature to a particular topic, including disagreement within the 8% of positive feature-topic associations strongly supported by truth data. When subjects concur, their judgments are correct, and for those 51 topic-feature pairings identified as significant we conduct an investigation into the best interactive search submissions showing that for 29 pairs, topic performance would have improved had users had access to ideal classifiers for those features. The benefits derive from generic features applied to generic topics (27 pairs), and in one case a specific feature applied to a specific topic. Re-ranking submitted shots based on features shows promise for automatic search runs, but not for interactive runs where a person already took care to rank shots well.