Improving Moderation of Online Discussions via Interpretable Neural Models
Mária Bieliková | Marián Simko | Matúš Pikuliak | Andrej Svec
[1] John Pavlopoulos, et al. Deeper Attention to Abusive User Content Moderation, 2017, EMNLP.
[2] Mohit Bansal, et al. Interpreting Neural Networks to Improve Politeness Comprehension, 2016, EMNLP.
[3] Jing Zhou, et al. Hate Speech Detection with Comment Embeddings, 2015, WWW.
[4] Jure Leskovec, et al. How Community Feedback Shapes User Behavior, 2014, ICWSM.
[5] Alessandro Moschitti, et al. Semi-supervised Question Retrieval with Gated Convolutions, 2015, NAACL.
[6] Vasudeva Varma, et al. Deep Learning for Hate Speech Detection in Tweets, 2017, WWW.
[7] Rob Fergus, et al. Visualizing and Understanding Convolutional Networks, 2013, ECCV.
[8] Jure Leskovec, et al. Antisocial Behavior in Online Discussion Communities, 2015, ICWSM.
[9] Xinlei Chen, et al. Visualizing and Understanding Neural Models in NLP, 2015, NAACL.
[10] Paolo Rosso, et al. Wikipedia Vandalism Detection: Combining Natural Language, Metadata, and Reputation Features, 2011, CICLing.
[11] Daniel Jurafsky, et al. Understanding Neural Networks through Representation Erasure, 2016, ArXiv.
[12] Yoav Freund, et al. A decision-theoretic generalization of on-line learning and an application to boosting, 1995, EuroCOLT.
[13] Klaus-Robert Müller, et al. "What is relevant in a text document?": An interpretable machine learning approach, 2016, PloS one.
[14] Cornelia Caragea, et al. Content-Driven Detection of Cyberbullying on the Instagram Social Network, 2016, IJCAI.
[15] Tomas Mikolov, et al. Enriching Word Vectors with Subword Information, 2016, TACL.
[16] Njagi Dennis Gitari, et al. A Lexicon-based Approach for Hate Speech Detection, 2015, MUE 2015.
[17] Bin Yu, et al. Beyond Word Importance: Contextual Decomposition to Extract Interactions from LSTMs, 2018, ICLR.
[18] Joel R. Tetreault, et al. Do Characters Abuse More Than Words?, 2016, SIGDIAL Conference.
[19] Virgílio A. F. Almeida, et al. "Like Sheep Among Wolves": Characterizing Hateful Users on Twitter, 2017, ArXiv.
[20] Diyi Yang, et al. Hierarchical Attention Networks for Document Classification, 2016, NAACL.
[21] Michael Wiegand, et al. A Survey on Hate Speech Detection using Natural Language Processing, 2017, SocialNLP@EACL.
[22] Virgílio A. F. Almeida, et al. Characterizing and Detecting Hateful Users on Twitter, 2018, ICWSM.
[23] Fei-Fei Li, et al. Visualizing and Understanding Recurrent Networks, 2015, ArXiv.
[24] Pete Burnap, et al. Us and them: identifying cyber hate on Twitter across multiple protected characteristics, 2016, EPJ Data Science.
[25] Joel R. Tetreault, et al. Abusive Language Detection in Online User Content, 2016, WWW.
[26] R. J. Williams, et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning, 2004, Machine Learning.
[27] Regina Barzilay, et al. Rationalizing Neural Predictions, 2016, EMNLP.