The Shanghai-Hongkong Team at MediaEval2012: Violent Scene Detection Using Trajectory-based Features

The Violent Scene Detection task offers a very practical challenge in detecting complex and diverse violent video clips in movies. In this working note paper, we briefly describe our system and discuss the results, which achieved top performance in mAP@20 and runner-up in mAP@100 among all 35 submissions worldwide. The central component of our system is a set of features derived from the appearance and motion of local patch trajectories [2]. We use these features with an SVM classifier as the baseline approach and add a few other components to further improve the performance. Our findings indicate that the trajectory-based visual features already offer very competitive results. Other audio-visual features such as Spatial-Temporal Interest Points and MFCC do not significantly enhance the performance. In addition, smoothing the detection scores of nearby shots leads to significant improvement. We conclude that, while audio features may help marginally, good visual features are still the key factor in violent scene detection, and temporal information is very useful.
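To illustrate the idea of smoothing detection scores across nearby shots, the following is a minimal sketch only. It assumes a plain moving average over a fixed window of neighbouring shots; the abstract does not specify the smoothing scheme, so the function name `smooth_scores`, the `radius` parameter, and the example scores are all hypothetical.

```python
def smooth_scores(scores, radius=1):
    """Replace each shot's score with the mean of the scores in a window
    of `radius` shots on either side (assumed scheme, not the paper's).

    `scores` is a list of per-shot detection scores in temporal order.
    Windows are truncated at the start and end of the sequence.
    """
    n = len(scores)
    smoothed = []
    for i in range(n):
        lo = max(0, i - radius)          # first shot in the window
        hi = min(n, i + radius + 1)      # one past the last shot in the window
        window = scores[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


if __name__ == "__main__":
    # Hypothetical per-shot SVM scores for six consecutive shots.
    shot_scores = [0.1, 0.9, 0.2, 0.8, 0.7, 0.1]
    print(smooth_scores(shot_scores, radius=1))
```

Under this assumption, an isolated high score surrounded by low-scoring shots is pulled down, while a run of consecutive high scores is preserved, which reflects the intuition that violent content usually spans several adjacent shots.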