Class-based Prediction Errors to Detect Hate Speech with Out-of-vocabulary Words

Common approaches to text categorization rely either on n-gram counts or on word embeddings. Both present important difficulties in highly dynamic, fast-paced environments, where the appearance of new words and varied misspellings is the norm. A paradigmatic example of this situation is abusive online behavior, with social networks and media platforms struggling to effectively combat uncommon or non-blacklisted hate words. To better deal with these issues in such fast-paced environments, we propose using the error signal of class-based language models as input to text classification algorithms. In particular, we train a next-character prediction model for each class, and then exploit the error of these class-based models to inform a neural network classifier. This way, we shift from the ability to describe seen documents to the ability to predict unseen content. Preliminary studies using out-of-vocabulary splits from abusive tweet data show promising results, outperforming competitive text categorization strategies by 4–11%.
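The core idea, per-class character-level language models whose prediction error becomes the classification feature, can be sketched in a few lines. The snippet below is a minimal illustration, not the authors' implementation: it uses a smoothed character bigram model as a stand-in for the paper's neural next-character predictor, and the names `CharBigramLM` and `class_error_features` are invented for this sketch.

```python
import math
from collections import defaultdict


class CharBigramLM:
    """Character bigram model with Laplace smoothing.

    A simple stand-in for the neural next-character predictor
    described in the abstract: one model is trained per class.
    """

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.counts = defaultdict(lambda: defaultdict(int))
        self.vocab = set()

    def train(self, texts):
        for t in texts:
            padded = "^" + t  # '^' marks start-of-text
            for a, b in zip(padded, padded[1:]):
                self.counts[a][b] += 1
                self.vocab.update((a, b))

    def error(self, text):
        """Average negative log-likelihood per character.

        This is the 'prediction error' signal: low for text that
        looks like the class the model was trained on, high otherwise,
        even for out-of-vocabulary variants and misspellings.
        """
        padded = "^" + text
        V = len(self.vocab) + 1  # +1 for unseen characters
        nll = 0.0
        for a, b in zip(padded, padded[1:]):
            ctx = self.counts[a]
            total = sum(ctx.values())
            p = (ctx.get(b, 0) + self.alpha) / (total + self.alpha * V)
            nll -= math.log(p)
        return nll / max(len(text), 1)


def class_error_features(text, class_models):
    """One prediction-error value per class model.

    In the paper this error vector feeds a neural network classifier;
    a trivial baseline would simply pick the class with lowest error.
    """
    return [m.error(text) for m in class_models]
```

Because the features are computed character by character, a never-seen spelling variant still produces an informative error profile, which is what lets the approach cope with out-of-vocabulary words where token-level n-grams or word embeddings fail.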
