Improving Hate Speech Detection with Deep Learning Ensembles

Hate speech has become a major issue and a widely discussed topic in social media. At the same time, currently proposed methods to address the issue raise concerns about censorship. Broadly speaking, our research focus is the area of human rights, including the development of new methods to identify and better address discrimination while protecting freedom of expression. As neural network approaches are becoming state of the art for text classification problems, we adapt an ensemble method for use with neural networks and present it as a way to better classify hate speech. Our method uses a publicly available embedding model and is tested against a hate speech corpus from Twitter. To confirm the robustness of our results, we additionally test against a popular sentiment dataset. Our method achieves an improvement of nearly 5 points in F-measure over the original work on a publicly available hate speech evaluation dataset. We also note difficulties encountered with the reproducibility of deep learning methods and with comparing findings from other work. Based on our experience, published work that relies on deep learning methods needs to report more details, and additional evaluation information should be considered as well. This information is provided to foster discussion within the research community for future work.
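To make the two technical ideas in the abstract concrete, the following is a minimal sketch of one common way to ensemble neural classifiers (averaging their class probabilities, often called soft voting) together with an F-measure computation for a single positive class. The abstract does not state how the paper combines its models or which F-measure variant it reports, so both choices are assumptions, and the names `ensemble_predict`, `f_measure`, and the toy probabilities are hypothetical illustrations rather than the authors' implementation or data.

```python
from typing import List, Sequence


def ensemble_predict(model_probs: Sequence[Sequence[Sequence[float]]]) -> List[int]:
    """Soft voting (assumed scheme): average class probabilities, then take the argmax.

    model_probs[m][i][c] is the probability that model m assigns to
    class c for example i.
    """
    n_models = len(model_probs)
    n_examples = len(model_probs[0])
    predictions = []
    for i in range(n_examples):
        n_classes = len(model_probs[0][i])
        # Mean probability of each class across all ensemble members.
        mean = [
            sum(model_probs[m][i][c] for m in range(n_models)) / n_models
            for c in range(n_classes)
        ]
        predictions.append(max(range(n_classes), key=mean.__getitem__))
    return predictions


def f_measure(y_true: Sequence[int], y_pred: Sequence[int], positive: int = 1) -> float:
    """F1 score for one positive class (harmonic mean of precision and recall)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical outputs of three models (e.g. the same network trained
    # with different random seeds) on four examples, two classes each.
    toy_probs = [
        [[0.9, 0.1], [0.4, 0.6], [0.3, 0.7], [0.6, 0.4]],
        [[0.8, 0.2], [0.6, 0.4], [0.2, 0.8], [0.4, 0.6]],
        [[0.7, 0.3], [0.3, 0.7], [0.6, 0.4], [0.7, 0.3]],
    ]
    toy_labels = [0, 1, 1, 1]  # made-up gold labels, 1 = hate speech
    predictions = ensemble_predict(toy_probs)
    print(predictions, f_measure(toy_labels, predictions))
```

Averaging probabilities lets a confident member outweigh an uncertain one, whereas majority voting over hard labels would treat every member equally; either could fit the description in the abstract.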

Udo Kruschwitz | Chris Fox | Steven Zimmerman

[1] Georgios Balikas,et al. TwiSE at SemEval-2016 Task 4: Twitter Sentiment Classification , 2016, *SEMEVAL.

[2] Walter Daelemans,et al. A Dictionary-based Approach to Racism Detection in Dutch Social Media , 2016, ArXiv.

[3] Yoon Kim,et al. Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[4] Wesley De Neve,et al. Multimedia Lab @ ACL WNUT NER Shared Task: Named Entity Recognition for Twitter Microposts using Distributed Word Representations , 2015, NUT@IJCNLP.

[5] Dirk Hovy,et al. Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter , 2016, NAACL.

[6] Gjorgji Madjarov,et al. Twitter Sentiment Analysis Using Deep Convolutional Neural Network , 2015, HAIS.

[7] Martín Abadi,et al. TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems , 2016, ArXiv.

[8] Matthias Hagen,et al. Webis: An Ensemble for Twitter Sentiment Detection , 2015, *SEMEVAL.

[9] Pascale Fung,et al. One-step and Two-step Classification for Abusive Language Detection on Twitter , 2017, ALW@ACL.

[10] F. Molteni,et al. The ECMWF Ensemble Prediction System: Methodology and validation , 1996 .

[11] Jeffrey Dean,et al. Distributed Representations of Words and Phrases and their Compositionality , 2013, NIPS.

[12] Vasudeva Varma,et al. Deep Learning for Hate Speech Detection in Tweets , 2017, WWW.

[13] Yoav Goldberg,et al. A Primer on Neural Network Models for Natural Language Processing , 2015, J. Artif. Intell. Res..

[14] Yi Yang,et al. Overcoming Language Variation in Sentiment Analysis with Social Attention , 2015, TACL.

[15] Alessandro Moschitti,et al. Twitter Sentiment Analysis with Deep Convolutional Neural Networks , 2015, SIGIR.

[16] Guigang Zhang,et al. Deep Learning , 2016, Int. J. Semantic Comput..

[17] Björn Gambäck,et al. Using Convolutional Neural Networks to Classify Hate-Speech , 2017, ALW@ACL.

[18] Rui Xia,et al. Ensemble of feature sets and classification algorithms for sentiment classification , 2011, Inf. Sci..

[19] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.

[20] Udo Kruschwitz,et al. Speaking of the weather: Detection of meteorological influences on sentiment within social media , 2017, 2017 9th Computer Science and Electronic Engineering (CEEC).

[21] Björn Ross,et al. Measuring the Reliability of Hate Speech Annotations: The Case of the European Refugee Crisis , 2016, ArXiv.

[22] Norbert Fuhr,et al. Some Common Mistakes In IR Evaluation, And How They Can Be Avoided , 2018, SIGIR Forum.

[23] Soroush Vosoughi,et al. Tweet2Vec: Learning Tweet Embeddings Using Character-level CNN-LSTM Encoder-Decoder , 2016, SIGIR.

[24] Derek Ruths,et al. A Web of Hate: Tackling Hateful Speech in Online Social Spaces , 2017, ArXiv.

[25] Zeerak Waseem,et al. Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter , 2016, NLP+CSS@EMNLP.

[26] Jeffrey Pennington,et al. GloVe: Global Vectors for Word Representation , 2014, EMNLP.

[27] Gaël Varoquaux,et al. Scikit-learn: Machine Learning in Python , 2011, J. Mach. Learn. Res..

[28] Preslav Nakov,et al. SemEval-2013 Task 2: Sentiment Analysis in Twitter , 2013, *SEMEVAL.

[29] Charu C. Aggarwal,et al. A Survey of Text Classification Algorithms , 2012, Mining Text Data.

[30] J. Morsink,et al. The Universal Declaration of Human Rights: Origins, Drafting, and Intent , 1999 .

[31] John Salvatier,et al. Theano: A Python framework for fast computation of mathematical expressions , 2016, ArXiv.

[32] Raphaël Troncy,et al. Sentiment Polarity Detection from Amazon Reviews: An Experimental Study , 2016, SemWebEval@ESWC.