Detecting and visualizing hate speech in social media: A cyber Watchdog for surveillance

Abstract The multi-fold growth of the social media user base has fuelled a substantial increase in the number of hate speech posts on social media platforms. The enormous data volume makes it hard to capture such posts and either moderate or delete them. This paper presents an approach to detect and visualize online aggression, a special case of hate speech, on social media. Comments are categorized as overtly aggressive (OAG), covertly aggressive (CAG), or non-aggressive (NAG). We have designed a user interface, built as a web browser plugin for Facebook and Twitter, to visualize the aggressive comments posted on social media users' timelines. This plugin interface might help security agencies keep tabs on the social media stream. It also provides citizens with a tool of a kind that is typically available only to large enterprises, which alleviates the technological imbalance between industry and citizens. In addition, the system might help the research community build further tools and prepare weakly labeled training data within a few minutes from comments posted by users on celebrities' Facebook and Twitter timelines. We report results on a newly created dataset of user comments, collected from Facebook and Twitter using our proposed plugins, and on the standard Trolling, Aggression and Cyberbullying (TRAC) 2018 dataset in English and code-mixed Hindi. Several classifiers have been used for aggression classification: Support Vector Machine (SVM), logistic regression, a deep learning model based on a Convolutional Neural Network (CNN), an attention-based model, and the recently proposed BERT pre-trained language model by Google AI. Weighted F1-scores of around 0.64 and 0.62 are achieved on the TRAC Facebook English and Hindi datasets, respectively, while on the Twitter English and Hindi datasets the weighted F1-scores are 0.58 and 0.50, respectively.
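The results above are reported as weighted F1-scores over the three labels. As a reading aid, the following is a minimal sketch of how such a score is commonly computed: the per-class F1 is averaged with weights proportional to each class's share of the true labels. The function name `weighted_f1` and the toy label lists are assumptions made for illustration; they are not taken from the paper, whose evaluation code is not shown here.

```python
from collections import Counter

# The three aggression labels named in the abstract.
LABELS = ("OAG", "CAG", "NAG")


def weighted_f1(y_true, y_pred, labels=LABELS):
    """Support-weighted mean of per-class F1 (hypothetical helper)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one example is required")

    support = Counter(y_true)  # number of true examples per label
    total = len(y_true)
    score = 0.0
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs.
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * support[label] / total
    return score


if __name__ == "__main__":
    # Toy labels, invented purely for illustration.
    gold = ["OAG", "OAG", "CAG", "NAG", "NAG", "NAG"]
    pred = ["OAG", "CAG", "CAG", "NAG", "NAG", "OAG"]
    print(round(weighted_f1(gold, pred), 4))
```

Because the weights follow class frequency, a majority class such as NAG dominates the score, which is worth keeping in mind when comparing the Facebook and Twitter figures.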
