Detecting Nastiness in Social Media

Although social media has made it easy for people to connect on a virtually unlimited basis, it has also opened doors to people who misuse it to undermine, harass, humiliate, threaten, and bully others. Adequate resources to detect and hinder such abuse are lacking. In this paper, we present our initial NLP approach to detecting invective posts as a first step toward eventually detecting and deterring cyberbullying. We crawl posts containing profanities and then determine whether or not each post contains invective. Annotations on this data are improved iteratively through in-lab annotation and crowdsourcing. We pursue several NLP approaches, combining typical and some newer techniques, to distinguish neutral uses of swear words from instances in which they are used in an insulting way. We also show that our model not only works on our own data set but can also be applied successfully to other data sets.
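
The two-stage idea described above (first collect posts that contain profanity, then decide whether the profanity is used as invective) can be illustrated with a minimal sketch. This is not the model from the paper: the lexicon `PROFANITY`, the cue list `SECOND_PERSON`, the `window` parameter, and the function names are all hypothetical, and the proximity heuristic is only a stand-in for a trained classifier.

```python
import re

# Hypothetical placeholder lexicon; the paper's actual profanity list is not given here.
PROFANITY = {"damn", "hell", "crap"}
# Assumed cue (not from the paper): second-person words hint that the reader is the target.
SECOND_PERSON = {"you", "your", "you're", "u", "ur"}

TOKEN_RE = re.compile(r"[a-z']+")


def tokenize(post):
    """Lowercase the post and split it into simple word tokens."""
    return TOKEN_RE.findall(post.lower())


def contains_profanity(post):
    """Candidate filter: keep only posts with at least one lexicon term."""
    return any(tok in PROFANITY for tok in tokenize(post))


def looks_invective(post, window=5):
    """Toy heuristic: a profanity within `window` tokens of a second-person word."""
    tokens = tokenize(post)
    for i, tok in enumerate(tokens):
        if tok in PROFANITY:
            nearby = tokens[max(0, i - window): i + window + 1]
            if any(t in SECOND_PERSON for t in nearby):
                return True
    return False


posts = [
    "Damn, that concert was amazing",
    "You are a piece of crap",
    "Lovely weather today",
]
# Stage 1: keep only posts that contain a lexicon term.
candidates = [p for p in posts if contains_profanity(p)]
# Stage 2: flag candidates where the profanity appears to be aimed at someone.
labels = {p: looks_invective(p) for p in candidates}
```

In this sketch the first post is kept as a candidate but labeled neutral, the second is labeled invective, and the third is never considered. A real system would replace `looks_invective` with a classifier trained on the annotated data.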
