Analyzing Right-wing YouTube Channels: Hate, Violence and Discrimination

As of 2018, YouTube, the major online video sharing website, hosts multiple channels promoting right-wing content. In this paper, we examine issues related to hate, violence and discriminatory bias in a dataset containing more than 7,000 videos and 17 million comments. We investigate similarities and differences between users' comments and video content in a selection of right-wing channels and compare them to a baseline set using a three-layered approach, in which we analyze (a) lexicon, (b) topics and (c) implicit biases present in the texts. Among other results, our analyses show that right-wing channels tend to (a) contain a higher proportion of words from "negative" semantic fields, (b) raise more topics related to war and terrorism, and (c) demonstrate more discriminatory bias against Muslims (in videos) and towards LGBT people (in comments). Our findings shed light not only on the collective conduct of the YouTube community promoting and consuming right-wing content, but also on the general behavior of YouTube users.
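
The implicit-bias layer (c) can be illustrated with an embedding association test. The abstract does not state which measure is used, so the sketch below assumes a WEAT-style effect size computed over word vectors. The function names, the placeholder word labels and the two-dimensional toy vectors are all hypothetical; they stand in for embeddings that would be trained separately on each corpus (for example captions versus comments).

```python
import math
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def association(word, attrs_a, attrs_b, emb):
    """Mean similarity of `word` to attribute set A minus attribute set B."""
    sim_a = mean(cosine(emb[word], emb[a]) for a in attrs_a)
    sim_b = mean(cosine(emb[word], emb[b]) for b in attrs_b)
    return sim_a - sim_b


def weat_effect_size(targets_x, targets_y, attrs_a, attrs_b, emb):
    """WEAT-style effect size: positive when X is closer to A than Y is."""
    s_x = [association(w, attrs_a, attrs_b, emb) for w in targets_x]
    s_y = [association(w, attrs_a, attrs_b, emb) for w in targets_y]
    # Normalize by the sample standard deviation over both target sets.
    return (mean(s_x) - mean(s_y)) / stdev(s_x + s_y)


# Hypothetical toy vectors (assumption): real use would load trained embeddings.
toy_emb = {
    "x1": (1.0, 0.1), "x2": (0.9, 0.2),    # target set X
    "y1": (0.1, 1.0), "y2": (0.2, 0.9),    # target set Y
    "a1": (1.0, 0.0), "a2": (0.95, 0.05),  # attribute set A
    "b1": (0.0, 1.0), "b2": (0.05, 0.95),  # attribute set B
}

effect = weat_effect_size(["x1", "x2"], ["y1", "y2"],
                          ["a1", "a2"], ["b1", "b2"], toy_emb)
print(f"effect size: {effect:.3f}")
```

Comparing the effect size obtained from a right-wing corpus with the one obtained from the baseline corpus, for the same target and attribute word sets, gives one way to express "more bias" as a number. The choice of word sets strongly influences the result.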
