Quarantining online hate speech: technical and ethical perspectives

In this paper we explore quarantining as a more ethical method for delimiting the spread of Hate Speech via online social media platforms. Currently, companies like Facebook, Twitter, and Google generally respond reactively to such material: offensive messages that have already been posted are reviewed by human moderators if user complaints are received, and the posts are removed only if those complaints are upheld, by which time they may already have caused the recipients psychological harm. This approach has also frequently been criticised for curtailing freedom of expression, since it requires the service providers to devise and implement censorship regimes. In recent years, an emerging generation of automatic Hate Speech detection systems has started to offer new strategies for dealing with this particular kind of offensive online material. Anticipating the future efficacy of such systems, the present article advocates an approach to online Hate Speech detection that is analogous to the quarantining of malicious computer software: if a given post is reliably classified as harmful, it can be temporarily quarantined and the direct recipients sent an alert, protecting them from the harmful content in the first instance. The quarantining framework is an example of more ethical online safety technology that can be extended to the handling of Hate Speech. Crucially, it provides flexible options for striking a more justifiable balance between freedom of expression and appropriate censorship.
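The quarantine-and-alert workflow described above can be sketched in a few lines of Python. This is a minimal illustration, not the paper's implementation: the classifier, the `QUARANTINE_THRESHOLD` value, and the `Post`/`moderate` names are all hypothetical stand-ins for whatever detection system and platform hooks a real deployment would use.

```python
from dataclasses import dataclass

# Hypothetical confidence threshold above which a post is quarantined;
# the article does not prescribe a specific value, and in practice it
# would be tuned to balance false alarms against missed harmful posts.
QUARANTINE_THRESHOLD = 0.8

@dataclass
class Post:
    author: str
    text: str
    quarantined: bool = False

def classify_hate_speech(text: str) -> float:
    """Stand-in for an automatic Hate Speech classifier.

    A real system would return a calibrated probability from a trained
    model; here a trivial keyword check is used purely for illustration.
    """
    return 1.0 if "<slur>" in text else 0.0

def moderate(post: Post) -> str:
    """Quarantine-style moderation: intercept before delivery, not after.

    Unlike reactive removal, the post is held and the recipient is
    alerted first, so they are never exposed to the content unwarned.
    """
    score = classify_hate_speech(post.text)
    if score >= QUARANTINE_THRESHOLD:
        post.quarantined = True
        # The recipient can then choose to view or discard the post;
        # nothing is silently deleted, which preserves flexibility
        # between free expression and protection from harm.
        return "quarantined: recipient alerted before viewing"
    return "delivered"
```

The key design point the sketch captures is that the decision happens between posting and delivery, so the psychological harm of unmediated exposure is avoided while the content itself is held rather than censored outright.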
