A Lexicon-based Approach for Hate Speech Detection

We explore the idea of creating a classifier that can be used to detect presence of hate speech in web discourses such as web forums and blogs. In this work, hate speech problem is abstracted into three main thematic areas of race, nationality and religion. The goal of our research is to create a model classifier that uses sentiment analysis techniques and in particular subjectivity detection to not only detect that a given sentence is subjective but also to identify and rate the polarity of sentiment expressions. We begin by whittling down the document size by removing objective sentences. Then, using subjectivity and semantic features related to hate speech, we create a lexicon that is employed to build a classifier for hate speech detection. Experiments with a hate corpus show significant practical application for a real-world web discourse.

[1]  G. G. Stokes "J." , 1890, The New Yale Book of Quotations.

[2]  George A. Miller,et al.  Introduction to WordNet: An On-line Lexical Database , 1990 .

[3]  Ellen Spertus,et al.  Smokey: Automatic Recognition of Hostile Messages , 1997, AAAI/IAAI.

[4]  Vasileios Hatzivassiloglou,et al.  Predicting the Semantic Orientation of Adjectives , 1997, ACL.

[5]  Ellen Riloff,et al.  Learning Dictionaries for Information Extraction by Multi-Level Bootstrapping , 1999, AAAI/IAAI.

[6]  Ellen Riloff,et al.  A Bootstrapping Method for Learning Semantic Lexicons using Extraction Pattern Contexts , 2002, EMNLP.

[7]  Ellen Riloff,et al.  Learning Extraction Patterns for Subjective Expressions , 2003, EMNLP.

[8]  Bo Pang,et al.  A Sentimental Education: Sentiment Analysis Using Subjectivity Summarization Based on Minimum Cuts , 2004, ACL.

[9]  Janyce Wiebe,et al.  Just How Mad Are You? Finding Strong and Weak Opinion Clauses , 2004, AAAI.

[10]  Ellen Riloff,et al.  Creating Subjective and Objective Sentence Classifiers from Unannotated Texts , 2005, CICLing.

[11]  Janyce Wiebe,et al.  Recognizing Contextual Polarity in Phrase-Level Sentiment Analysis , 2005, HLT.

[12]  Hsinchun Chen,et al.  US domestic extremist groups on the Web: link and content analysis , 2005, IEEE Intelligent Systems.

[13]  Maite Taboada,et al.  Methods for Creating Semantic Orientation Dictionaries , 2006, LREC.

[14]  Andrea Esuli,et al.  SENTIWORDNET: A Publicly Available Lexical Resource for Opinion Mining , 2006, LREC.

[15]  Jennifer Jie Xu,et al.  Mining communities and their relationships in blogs: A study of online hate groups , 2007, Int. J. Hum. Comput. Stud..

[16]  Lillian Lee,et al.  Opinion Mining and Sentiment Analysis , 2008, Found. Trends Inf. Retr..

[17]  Ahmed Abbasi,et al.  Affect Intensity Analysis of Dark Web Forums , 2007, 2007 IEEE Intelligence and Security Informatics.

[18]  Songbo Tan,et al.  Combining learn-based and lexicon-based techniques for sentiment detection without using labeled examples , 2008, SIGIR '08.

[19]  Philip S. Yu,et al.  A holistic lexicon-based approach to opinion mining , 2008, WSDM '08.

[20]  Iadh Ounis,et al.  Overview of the TREC 2008 Blog Track , 2008, TREC.

[21]  Hsinchun Chen,et al.  Sentiment analysis in multiple languages: Feature selection for opinion classification in Web forums , 2008, TOIS.

[22]  Iadh Ounis,et al.  Overview of the TREC-2009 Blog Track | NIST , 2008 .

[23]  Christopher D. Manning,et al.  Stanford typed dependencies manual , 2010 .

[24]  Hsinchun Chen,et al.  A Lexicon-Enhanced Method for Sentiment Classification: An Experiment on Online Product Reviews , 2010, IEEE Intelligent Systems.

[25]  Zhi Xu,et al.  Filtering Offensive Language in Online Communities using Grammatical Relations , 2010 .

[26]  Raphael Cohen-Almagor,et al.  Fighting Hate and Bigotry on the Internet , 2011 .

[27]  Julia Hirschberg,et al.  Detecting Hate Speech on the World Wide Web , 2012 .

[28]  I-Hsien Ting,et al.  Content matters: A study of hate groups detection based on social networks analysis and web mining , 2013, 2013 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM 2013).

[29]  Yuzhou Wang,et al.  Locate the Hate: Detecting Tweets against Blacks , 2013, AAAI.

[30]  Foreword and Editorial International Journal of Multimedia and Ubiquitous Engineering , 2017 .