Assessing Effectiveness in Video Retrieval

This paper examines results from the last two years of the TRECVID video retrieval evaluations. While there is encouraging evidence of progress in video retrieval, there are also several major disappointments, confirming that the field is still in its infancy. Many publications blithely attribute improvements in retrieval tasks to particular techniques without paying much attention to the statistical reliability of the comparisons. We analyze the official TRECVID evaluation results using both retrieval experiment error rates and ANOVA measures, and demonstrate that the differences between many systems are not statistically significant. We conclude with lessons learned from both the results that show statistically significant differences and those that do not.