Graph-Based Features for Automatic Online Abuse Detection

While online communities have grown increasingly important over the years, the moderation of user-generated content is still performed mostly by hand. Automating this task is an important step toward reducing the financial cost of moderation, but most automated approaches, which rely strictly on message content, are highly vulnerable to intentional obfuscation. In this paper, we discuss methods for extracting conversational networks from raw multi-participant chat logs, and we study the contribution of graph features to a classification system that determines whether a given message is abusive. The conversational graph-based system yields unexpectedly high performance, with results comparable to those previously obtained with a content-based approach.
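To make the idea concrete, the following is a minimal sketch of one common way to turn a multi-participant chat log into a conversational graph and derive per-author graph features. The exact construction used in the paper is not reproduced here; the sliding-window linking rule, the `build_conversation_graph` and `graph_features` helpers, and the choice of features (degree, PageRank) are illustrative assumptions, implemented with the `networkx` library.

```python
import networkx as nx

def build_conversation_graph(messages, window=3):
    """Illustrative construction: link each message's author to the authors
    of the preceding `window` messages, on the assumption that nearby
    messages in a chat log are likely replies to one another."""
    g = nx.DiGraph()
    for i, (author, _text) in enumerate(messages):
        g.add_node(author)
        for prev_author, _ in messages[max(0, i - window):i]:
            if prev_author != author:
                # Accumulate interaction counts as edge weights.
                w = g.get_edge_data(author, prev_author, {"weight": 0})["weight"]
                g.add_edge(author, prev_author, weight=w + 1)
    return g

def graph_features(g, author):
    """A few per-author features of the kind a downstream classifier
    could consume alongside (or instead of) content features."""
    return {
        "out_degree": g.out_degree(author),   # messages directed at others
        "in_degree": g.in_degree(author),     # messages directed at this author
        "pagerank": nx.pagerank(g).get(author, 0.0),
    }

log = [
    ("alice", "hi"),
    ("bob", "hello"),
    ("carol", "hey all"),
    ("bob", "what's up"),
]
g = build_conversation_graph(log)
feats = graph_features(g, "bob")
```

Feature vectors of this kind are robust to lexical obfuscation precisely because they ignore message content: an abusive user can disguise individual words, but the interaction structure around their messages is harder to fake.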
