Evaluation of Active Learning Strategies for Video Indexing

In this paper, we compare active learning strategies for indexing concepts in video shots. Active learning is simulated using subsets of a fully annotated dataset rather than by calling for actual user intervention. Training is done using the collaborative annotation of 39 concepts from the TRECVID 2005 campaign. Performance is measured on the 20 concepts selected for the TRECVID 2006 concept detection task. The simulation makes it possible to explore the effect of several parameters: the strategy, the annotated fraction of the dataset, the number of iterations, and the relative difficulty of the concepts. Three strategies are compared. The first two select the most probable and the most uncertain samples, respectively; the third is a random choice. For easy concepts, the "most probable" strategy performs best when less than 15% of the dataset is annotated, while the "most uncertain" strategy performs best when 15% or more of the dataset is annotated. The "most probable" and "most uncertain" strategies are roughly equivalent for moderately difficult and difficult concepts. In all cases, the maximum performance is reached when 12 to 15% of the whole dataset is annotated.
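
To make the three selection strategies concrete, the sketch below shows one way such a simulation could be set up. It is a minimal illustration, not the authors' implementation: it uses scikit-learn's LogisticRegression and synthetic data as stand-ins for the actual classifiers and TRECVID shot descriptors, and the seed size, batch size, and number of iterations are illustrative assumptions rather than values taken from the paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def select_batch(scores, unlabeled_idx, batch_size, strategy, rng):
    """Pick the next samples to annotate from the unlabeled pool.

    scores: classifier probabilities for the positive class on the pool.
    strategy: "most_probable", "most_uncertain", or "random".
    """
    if strategy == "most_probable":
        order = np.argsort(-scores)               # highest positive score first
    elif strategy == "most_uncertain":
        order = np.argsort(np.abs(scores - 0.5))  # closest to the decision boundary first
    elif strategy == "random":
        order = rng.permutation(len(unlabeled_idx))
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return unlabeled_idx[order[:batch_size]]

def simulate_active_learning(X, y, strategy, seed_size=100, batch_size=100,
                             n_iterations=20, random_state=0):
    """Simulate active learning on a fully annotated dataset:
    labels are revealed only for the samples the strategy selects."""
    rng = np.random.default_rng(random_state)
    labeled = rng.choice(len(X), size=seed_size, replace=False)
    clf = None
    for _ in range(n_iterations):
        clf = LogisticRegression(max_iter=1000).fit(X[labeled], y[labeled])
        unlabeled = np.setdiff1d(np.arange(len(X)), labeled)
        if len(unlabeled) == 0:
            break
        scores = clf.predict_proba(X[unlabeled])[:, 1]
        picked = select_batch(scores, unlabeled, batch_size, strategy, rng)
        labeled = np.concatenate([labeled, picked])
    return clf, labeled

if __name__ == "__main__":
    # Toy usage on synthetic data standing in for TRECVID shot features.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(5000, 20))
    y = (X[:, 0] + 0.5 * rng.normal(size=5000) > 1.0).astype(int)  # infrequent "concept"
    for strategy in ("most_probable", "most_uncertain", "random"):
        clf, labeled = simulate_active_learning(X, y, strategy)
        print(strategy, "annotated fraction:", len(labeled) / len(X))
```

In this hypothetical setup, the "most probable" strategy feeds the annotator the samples the current classifier already considers most likely to contain the concept, while the "most uncertain" strategy feeds it the samples closest to the decision boundary; the random strategy serves as a baseline.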