Detecting and classifying online dark visual propaganda

Abstract The staggering increase in the amount of information on the World Wide Web (hereafter, the Web) has made Web page classification essential for retrieving useful information while filtering out unwanted, futile, or harmful content. This massive information-sharing platform is occasionally abused to propagate extreme and radical ideologies and to pose threats to national security and citizens. Detecting this so-called dark material has gained impetus following the recent spread of extremist groups and radical ideologies across the Web. The goal of this project, the first of its kind, is to surveil online social networks (OSNs) and the Web for real-time detection of visual propaganda by violent extremist organizations (VEOs). This is valuable not only for flagging and removing such content from OSNs and the Web, but also for providing military insight and narrative context inside VEOs. Visual propaganda by VEOs is not only detected but also further classified, based on the type of VEO and the focus or intent of the image, into hard propaganda, soft propaganda, symbolic propaganda, landscape, and organizational communications. Over 1.2 million images were automatically collected from suspicious OSN accounts and Web pages over the course of four years, of which 120,000 were manually classified to provide the training data for a convolutional neural network. An overall generalization accuracy of 97.02% and an F1 of 97.89% were achieved for binary classification (mere detection of visual VEO propaganda), and an overall generalization accuracy of 86.08% and an $\overline{F_1}$ of 85.76% were achieved for eight-way classification based on the intent of the image.
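To make the reported metrics concrete, the following minimal Python sketch computes accuracy, per-class F1, and an averaged F1 from true and predicted labels. It is an illustration only, not the authors' evaluation code: the function names and the toy labels are hypothetical, and it assumes that $\overline{F_1}$ denotes the unweighted (macro) average of per-class F1 scores, which the abstract does not state.

```python
def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_for_class(y_true, y_pred, label):
    """F1 of one class: 2*TP / (2*TP + FP + FN); 0.0 if the class never occurs."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == label and p == label)
    fp = sum(1 for t, p in pairs if t != label and p == label)
    fn = sum(1 for t, p in pairs if t == label and p != label)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 (assumed meaning of the barred F1)."""
    labels = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in labels) / len(labels)


# Hypothetical toy labels: 1 = VEO propaganda, 0 = other content.
y_true = [1, 1, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1]

acc = accuracy(y_true, y_pred)
f1_positive = f1_for_class(y_true, y_pred, 1)
f1_macro = macro_f1(y_true, y_pred)
```

For binary detection, the F1 of the positive class is the natural counterpart of the single F1 figure in the abstract; for the multi-class case, the same per-class function is averaged over all classes.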
