Classification Performance of Violence Content by Deep Neural Network with Monarch Butterfly Optimization

Violence is a self-contained concept, yet it is perplexing to identify because the positive instances shown in the media differ visibly in content. In addition, the ever-increasing demand for internet video, spanning many types and genres, makes it difficult to search this enormous volume of content properly. Classifying violent content helps users choose movies or web videos that are suitable for their audience. Nevertheless, this is a cumbersome job, since the definition of violence is broad and subjective. Detecting such nuances in videos without human supervision is technically demanding and can lead to conceptual problems. Generally, violence classification is performed on text, audio, and visual features; of these, audio and visual features are the most relevant. Deep neural networks are the current direction in machine learning for solving classification problems. In this research, audio and visual features are learned by a deep neural network for more specific violence content classification. This study explores the implementation of a deep neural network with monarch butterfly optimization (DNNMBO) to classify violent content in web videos effectively. The experiments are conducted using YouTube videos from the VSD2014 dataset, which the Technicolor group has made publicly available. The results are compared with a similarly modified approach, DNNPSO, and with the original DNN. DNNMBO achieved a violence classification rate of 94%.
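The abstract does not state how monarch butterfly optimization is coupled to the network. One plausible arrangement, assumed here purely for illustration, is to use MBO as a gradient-free search over the network weights. The sketch below follows the general shape of MBO (a migration operator for one subpopulation, an adjusting operator for the other, and elitism), but it is a simplification and not the authors' DNNMBO: the heavy-tailed Lévy step is approximated by a Cauchy draw, the network is a tiny one-hidden-layer model, and the 8-dimensional "audio-visual" features, the labels, and all names and parameter values (`mbo_minimize`, `predict`, `bar`, `s_max`, and so on) are hypothetical stand-ins rather than VSD2014 data.

```python
import numpy as np

def mbo_minimize(fitness, dim, pop_size=20, generations=30, p=5 / 12,
                 peri=1.2, bar=5 / 12, s_max=1.0, n_elite=2,
                 bounds=(-3.0, 3.0), seed=0):
    """Simplified monarch butterfly optimization (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pop = rng.uniform(lo, hi, size=(pop_size, dim))
    fit = np.array([fitness(x) for x in pop])
    n1 = int(np.ceil(p * pop_size))   # butterflies in Land 1
    n2 = pop_size - n1                # butterflies in Land 2
    cols = np.arange(dim)
    history = [fit.min()]             # best fitness before any update

    for t in range(1, generations + 1):
        order = np.argsort(fit)       # sort so the best individual is first
        pop, fit = pop[order], fit[order]
        elite, elite_fit = pop[:n_elite].copy(), fit[:n_elite].copy()
        best = pop[0].copy()
        land1, land2 = pop[:n1], pop[n1:]

        # Migration operator: each element of a Land 1 butterfly is copied
        # from a random member of Land 1 or of Land 2.
        from1 = land1[rng.integers(n1, size=(n1, dim)), cols]
        from2 = land2[rng.integers(n2, size=(n1, dim)), cols]
        new1 = np.where(rng.random((n1, dim)) * peri <= p, from1, from2)

        # Adjusting operator: each element of a Land 2 butterfly is copied
        # from the best individual or from a random Land 2 member; the
        # latter may also take a heavy-tailed step whose scale shrinks
        # with the generation count (Cauchy used as a Levy stand-in).
        rand2 = land2[rng.integers(n2, size=(n2, dim)), cols]
        step = (s_max / t ** 2) * (rng.standard_cauchy((n2, dim)) - 0.5)
        walk = np.where(rng.random((n2, dim)) > bar, step, 0.0)
        new2 = np.where(rng.random((n2, dim)) <= p, best, rand2 + walk)

        pop = np.clip(np.vstack([new1, new2]), lo, hi)
        fit = np.array([fitness(x) for x in pop])

        # Elitism: overwrite the worst individuals with the saved elites.
        worst = np.argsort(fit)[-n_elite:]
        pop[worst], fit[worst] = elite, elite_fit
        history.append(fit.min())

    i_best = int(np.argmin(fit))
    return pop[i_best], float(fit[i_best]), history

def predict(w, X, n_hid):
    """One-hidden-layer network whose weights are packed in the vector w."""
    n_in = X.shape[1]
    i = n_in * n_hid
    W1 = w[:i].reshape(n_in, n_hid)
    b1 = w[i:i + n_hid]
    W2 = w[i + n_hid:i + 2 * n_hid]
    b2 = w[-1]
    hidden = np.tanh(X @ W1 + b1)
    return 1.0 / (1.0 + np.exp(-(hidden @ W2 + b2)))

# Synthetic stand-in for fused audio-visual features (NOT VSD2014 data).
data_rng = np.random.default_rng(1)
X = data_rng.normal(size=(200, 8))
y = (X[:, 0] + 0.5 * X[:, 1] > 0).astype(float)  # toy "violent" label

n_hid = 4
dim = X.shape[1] * n_hid + n_hid + n_hid + 1     # total number of weights

def mse(w):
    # Fitness to minimize: mean squared error of the network output.
    return float(np.mean((predict(w, X, n_hid) - y) ** 2))

w_best, loss_best, history = mbo_minimize(mse, dim)
accuracy = float(np.mean((predict(w_best, X, n_hid) > 0.5) == (y > 0.5)))
print(f"initial best loss {history[0]:.4f}, final loss {loss_best:.4f}, "
      f"training accuracy {accuracy:.2f}")
```

Because the elites are carried over unchanged, the best loss in this sketch cannot get worse from one generation to the next. The numbers it prints describe the toy data only and say nothing about the 94% rate reported for DNNMBO.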
