VSD2014: A dataset for violent scenes detection in Hollywood movies and web videos

In this paper, we introduce VSD2014, a dataset for the detection of violent scenes and violence-related concepts. It contains annotations as well as auditory and visual features of Hollywood movies and user-generated footage shared on the web. The dataset is the result of a joint annotation endeavor of different research institutions and responds to the real-world use case of parental guidance in selecting appropriate content for children. The dataset has been validated during the Violent Scenes Detection (VSD) task at the MediaEval benchmarking initiative for multimedia evaluation.
