A large-scale benchmark dataset for event recognition in surveillance video