Identifying Toxicity Within YouTube Video Comments

Online Social Networks (OSNs), once regarded as safe havens for sharing information and providing mutual support among groups of people, have become breeding grounds for spreading toxic behaviors, political propaganda, and radicalizing content. Toxic individuals often hide under the auspices of anonymity to create fruitless arguments and divert the attention of other users from the core objectives of a community. In this study, we examined five recurring forms of toxicity among the comments posted on pro- and anti-NATO channels on YouTube. We leveraged the YouTube Data API to collect video and comment data from eight channels, and we then used Google’s Perspective API to assign toxicity scores to each comment. Our analysis suggests that, on average, commenters on the anti-NATO channels are more likely to be toxic than those on the pro-NATO channels. We further found that commenters on pro-NATO channels tend to post a mixture of toxic and innocuous comments. We generated word clouds to visualize word-use frequency and applied Latent Dirichlet Allocation (LDA) topic modeling to classify the comments by their overall topics. The topics extracted from the pro-NATO channels’ comments were primarily positive, such as “Alliance” and “United”, whereas the topics extracted from the anti-NATO channels’ comments were oriented more toward geographical references, such as “Russia”, and negative components, such as “Profanity” and “Fake News”. By identifying and examining the toxic behaviors of commenters on YouTube, our analysis addresses the pressing need to understand this toxicity.
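As a concrete illustration of the collection step, the sketch below pulls top-level comment text for a single video through the YouTube Data API v3 `commentThreads.list` endpoint. This is a minimal sketch, not the paper’s actual pipeline: the API key placeholder, the `fetch_comments` helper, and the five-page cap are assumptions for illustration.

```python
from googleapiclient.discovery import build

# Placeholder credential; the paper does not disclose its key or channel list.
API_KEY = "YOUR_API_KEY"

youtube = build("youtube", "v3", developerKey=API_KEY)

def fetch_comments(video_id, max_pages=5):
    """Collect top-level comment texts for one video, paging through results."""
    comments, token = [], None
    for _ in range(max_pages):
        response = youtube.commentThreads().list(
            part="snippet",
            videoId=video_id,
            textFormat="plainText",
            maxResults=100,   # API maximum per page
            pageToken=token,
        ).execute()
        for item in response["items"]:
            comments.append(
                item["snippet"]["topLevelComment"]["snippet"]["textDisplay"]
            )
        token = response.get("nextPageToken")
        if token is None:
            break
    return comments
```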
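The scoring step maps onto Perspective API’s `comments.analyze` method, shown below using the standard discovery-based Python client setup. The abstract does not name the five toxicity attributes that were scored, so the attribute list here is illustrative rather than the paper’s exact set.

```python
from googleapiclient import discovery

API_KEY = "YOUR_API_KEY"  # placeholder Perspective API key

client = discovery.build(
    "commentanalyzer",
    "v1alpha1",
    developerKey=API_KEY,
    discoveryServiceUrl=(
        "https://commentanalyzer.googleapis.com/$discovery/rest?version=v1alpha1"
    ),
    static_discovery=False,
)

# Illustrative set of five Perspective attributes; the paper's exact set may differ.
ATTRIBUTES = ["TOXICITY", "SEVERE_TOXICITY", "INSULT", "PROFANITY", "THREAT"]

def toxicity_scores(comment_text):
    """Return the summary score (0 to 1) for each requested attribute."""
    request = {
        "comment": {"text": comment_text},
        "requestedAttributes": {attr: {} for attr in ATTRIBUTES},
    }
    response = client.comments().analyze(body=request).execute()
    return {
        attr: response["attributeScores"][attr]["summaryScore"]["value"]
        for attr in ATTRIBUTES
    }
```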
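For the topic-modeling step, a minimal LDA sketch using scikit-learn follows; the abstract does not say which implementation the authors used, so the library choice and hyperparameters are illustrative.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

def extract_topics(comments, n_topics=10, n_top_words=8):
    """Fit LDA on a bag-of-words matrix and print each topic's top terms."""
    vectorizer = CountVectorizer(stop_words="english", max_df=0.95, min_df=2)
    doc_term = vectorizer.fit_transform(comments)
    lda = LatentDirichletAllocation(n_components=n_topics, random_state=0)
    lda.fit(doc_term)
    vocab = vectorizer.get_feature_names_out()
    for idx, weights in enumerate(lda.components_):
        top_terms = [vocab[i] for i in weights.argsort()[::-1][:n_top_words]]
        print(f"Topic {idx}: {', '.join(top_terms)}")
```

Inspecting each topic’s highest-weight terms is how labels such as “Alliance” or “Fake News” would be assigned to the extracted topics.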
