A Comparison of Machine Learning Approaches for Detecting Misogynistic Speech in Urban Dictionary

Recent moves to treat misogyny as a hate crime have refocused the efforts of web property owners on detecting and removing misogynistic speech. This paper considers the use of deep learning techniques to detect misogyny in Urban Dictionary, a crowdsourced online dictionary of slang words and phrases. We compare the performance of two deep learning techniques, Bi-LSTM and Bi-GRU, with that of three more conventional machine learning techniques: logistic regression, Naive Bayes classification, and Random Forest classification. We find that both deep learning techniques detect misogyny in Urban Dictionary with greater accuracy than the conventional techniques examined.
