Directions in abusive language training data, a systematic review: Garbage in, garbage out
[1] I. Shapiro. Problems, Methods, and Theories in the Study of Politics, or What's Wrong with Political Science and What to Do About it , 2002 .
[2] G. Smith,et al. Bias in meta-analysis detected by a simple, graphical test , 1997, BMJ.
[3] Gianluca Stringhini,et al. Measuring #GamerGate: A Tale of Hate, Sexism, and Bullying , 2017, WWW.
[4] Ona de Gibert,et al. Hate Speech Dataset from a White Supremacy Forum , 2018, ALW.
[5] Michael C. Frank,et al. Data availability, reusability, and analytic reproducibility: evaluating the impact of a mandatory open data policy at the journal Cognition , 2018, Royal Society Open Science.
[6] Preslav Nakov,et al. Predicting the Type and Target of Offensive Posts in Social Media , 2019, NAACL.
[7] L. Lachenicht. Aggravating language: a study of abusive and insulting language , 1980 .
[8] Radhika Mamidi,et al. When does a compliment become sexist? Analysis and classification of ambivalent sexism using twitter data , 2017, NLP+CSS@ACL.
[9] Kalina Bontcheva,et al. Broad Twitter Corpus: A Diverse Named Entity Recognition Resource , 2016, COLING.
[10] Animesh Mukherjee,et al. Spread of Hate Speech in Online Social Media , 2018, WebSci.
[11] Siân Brooke,et al. "There are no girls on the Internet": Gender performances in Advice Animal memes , 2019, First Monday.
[12] Marco Guerini,et al. CONAN - COunter NArratives through Nichesourcing: a Multilingual Dataset of Responses to Fight Online Hate Speech , 2019, ACL.
[13] Gabriela Ferraro,et al. Transfer learning for hate speech detection in social media , 2019, Journal of Computational Social Science.
[14] Gianluca Stringhini,et al. Hate is not Binary: Studying Abusive Behavior of #GamerGate on Twitter , 2017, HT.
[15] Jane Suiter,et al. Post-truth Politics , 2016 .
[16] Sarah Myers West,et al. Censored, suspended, shadowbanned: User interpretations of content moderation on social media platforms , 2018, New Media Soc..
[17] Ralf Peters,et al. Detecting Offensive Statements towards Foreigners in Social Media , 2017, HICSS.
[18] Kalina Bontcheva,et al. Corpus Annotation through Crowdsourcing: Towards Best Practice Guidelines , 2014, LREC.
[19] Raphaël Troncy,et al. Analysis of named entity recognition and linking for tweets , 2014, Inf. Process. Manag..
[20] Kalina Bontcheva,et al. The GATE Crowdsourcing Plugin: Crowdsourcing Annotated Corpora Made Easy , 2014, EACL.
[21] Jim Waldo,et al. Privacy, anonymity, and big data in the social sciences , 2014 .
[22] Gianluca Stringhini,et al. Large Scale Crowdsourcing and Characterization of Twitter Abusive Behavior , 2018, ICWSM.
[23] Julia Hirschberg,et al. Detecting Hate Speech on the World Wide Web , 2012 .
[24] Anne-Wil Harzing,et al. Google Scholar, Scopus and the Web of Science: a longitudinal and cross-disciplinary comparison , 2015, Scientometrics.
[25] Alice E. Marwick,et al. Online Harassment, Defamation, and Hateful Speech: A Primer of the Legal Landscape , 2014 .
[26] Cristina Bosco,et al. An Italian Twitter Corpus of Hate Speech against Immigrants , 2018, LREC.
[27] Hugo Jair Escalante,et al. Overview of MEX-A3T at IberLEF 2019: Authorship and Aggressiveness Analysis in Mexican Spanish Tweets , 2019, IberLEF@SEPLN.
[28] Gianluca Stringhini,et al. What is Gab: A Bastion of Free Speech or an Alt-Right Echo Chamber , 2018, WWW.
[29] Cody Buntain,et al. A Large Labeled Corpus for Online Harassment Research , 2017, WebSci.
[30] Yangqiu Song,et al. Multilingual and Multi-Aspect Hate Speech Analysis , 2019, EMNLP.
[31] Timnit Gebru,et al. Lessons from archives: strategies for collecting sociocultural data in machine learning , 2019, FAT*.
[32] Zeerak Waseem,et al. Are You a Racist or Am I Seeing Things? Annotator Influence on Hate Speech Detection on Twitter , 2016, NLP+CSS@EMNLP.
[33] M. Taddeo. Data philanthropy and the design of the infraethics for information societies , 2016, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences.
[34] Douwe Kiela,et al. The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes , 2020, NeurIPS.
[35] Alexander van Deursen,et al. The digital divide shifts to differences in usage , 2014, New Media Soc..
[36] Ritesh Kumar,et al. Aggression-annotated Corpus of Hindi-English Code-mixed Data , 2018, LREC.
[37] Tarleton Gillespie,et al. Content moderation, AI, and the question of scale , 2020, Big Data Soc..
[38] David Reitter,et al. Crowdsourcing the Measurement of Interstate Conflict , 2016, PloS one.
[39] Gianluca Stringhini,et al. Screenshot Classifier annotated images pHashes of non-screenshot annotated images Know Your Meme Generic Annotation Sites Meme Annotation Sites Generic Web Communities , 2018 .
[40] J. Bohannon. Human subject research. Social science for pennies. , 2011, Science.
[41] Iryna Gurevych,et al. Text Processing Like Humans Do: Visually Attacking and Shielding NLP Systems , 2019, NAACL.
[42] Michael Wiegand,et al. Detection of Abusive Language: the Problem of Biased Datasets , 2019, NAACL.
[43] Sérgio Nunes,et al. A Hierarchically-Labeled Portuguese Hate Speech Dataset , 2019, Proceedings of the Third Workshop on Abusive Language Online.
[44] Ingmar Weber,et al. Racial Bias in Hate Speech and Abusive Language Detection Datasets , 2019, Proceedings of the Third Workshop on Abusive Language Online.
[45] Scott A. Hale,et al. Challenges and frontiers in abusive content detection , 2019, Proceedings of the Third Workshop on Abusive Language Online.
[46] James Goulding,et al. Psychology of personal data donation , 2019, PloS one.
[47] Michael Veale,et al. Like Trainer, Like Bot? Inheritance of Bias in Algorithmic Content Moderation , 2017, SocInfo.
[48] Michael Wiegand,et al. A Survey on Hate Speech Detection using Natural Language Processing , 2017, SocialNLP@EACL.
[49] B. Jansen,et al. Mapping online hate: A scientometric analysis on research trends and hotspots in research on online hate , 2019, PloS one.
[50] Sophie Ritson,et al. ‘Crackpots’ and ‘active researchers’: The controversy over links between arXiv and the scientific blogosphere , 2016, Social studies of science.
[51] Justin Reich,et al. Privacy, anonymity, and big data in the social sciences , 2014, Commun. ACM.
[52] Virgílio A. F. Almeida,et al. Characterizing and Detecting Hateful Users on Twitter , 2018, ICWSM.
[53] Sylvie Delacroix,et al. Bottom-Up Data Trusts: Disturbing the ‘One Size Fits All’ Approach to Data Governance , 2018, International Data Privacy Law.
[54] Lucas Dixon,et al. Ex Machina: Personal Attacks Seen at Scale , 2016, WWW.
[55] Bernard J. Jansen,et al. Developing an online hate classifier for multiple social media platforms , 2020, Human-centric Computing and Information Sciences.
[56] Ankur Taly,et al. Counterfactual Fairness in Text Classification through Robustness , 2018, AIES.
[57] James Pustejovsky,et al. Natural Language Annotation for Machine Learning - a Guide to Corpus-Building for Applications , 2012 .
[58] Dragomir R. Radev,et al. The ACL Anthology Reference Corpus: A Reference Dataset for Bibliographic Research in Computational Linguistics , 2008, LREC.
[59] David Jurgens,et al. A Just and Comprehensive Strategy for Using NLP to Address Online Abuse , 2019, ACL.
[60] Jing Qian,et al. A Benchmark Dataset for Learning to Intervene in Online Hate Speech , 2019, EMNLP.
[61] Wendy Hall,et al. Growing the artificial intelligence industry in the UK , 2017 .
[62] Nabiha Aziz. Dog Whistles and Discriminatory Intent: Proving Intent Through Campaign Speech in Voting Rights Litigation , 2019 .
[63] Leon Derczynski,et al. Offensive Language and Hate Speech Detection for Danish , 2019, LREC.
[64] Yejin Choi,et al. The Risk of Racial Bias in Hate Speech Detection , 2019, ACL.
[65] M. Williams,et al. Cyber-hate on social media in the aftermath of Woolwich , 2015 .
[66] Ingmar Weber,et al. Automated Hate Speech Detection and the Problem of Offensive Language , 2017, ICWSM.
[68] Philip M. Davis,et al. Does the arXiv lead to higher citations and reduced publisher downloads for mathematics articles? , 2006, Scientometrics.
[69] Pete Burnap,et al. Us and them: identifying cyber hate on Twitter across multiple protected characteristics , 2016, EPJ Data Science.
[70] James Davis,et al. Evaluating and improving the usability of Mechanical Turk for low-income workers in India , 2010, ACM DEV '10.
[71] Lifeng Lin,et al. Quantifying publication bias in meta‐analysis , 2018, Biometrics.
[72] Alan Macfarlane,et al. Social , 1994, Schizophrenia Research.
[73] E. Edmonds. The New ABCs of Research: Achieving Breakthrough Collaborations , 2017, Leonardo.
[74] Emily M. Bender,et al. Data Statements for Natural Language Processing: Toward Mitigating System Bias and Enabling Better Science , 2018, TACL.
[75] Gianluca Stringhini,et al. Kek, Cucks, and God Emperor Trump: A Measurement Study of 4chan's Politically Incorrect Forum and Its Effects on the Web , 2016, ICWSM.
[76] Reuben Binns,et al. Algorithmic content moderation: Technical and political challenges in the automation of platform governance , 2020, Big Data Soc..
[77] Mehmet Fatih Çömlekçi. Custodians of the Internet: Platforms, Content Moderation, and the Hidden Decisions that Shape Social Media , 2019 .
[78] Nathan Schneider,et al. Association for Computational Linguistics: Human Language Technologies , 2011 .
[79] Benjamin E. Lauderdale,et al. Crowd-sourced Text Analysis: Reproducible and Agile Production of Political Data , 2016, American Political Science Review.
[80] Sara Tonelli,et al. Creating a WhatsApp Dataset to Study Pre-teen Cyberbullying , 2018, ALW.
[81] Bernard J. Jansen,et al. Online Hate Interpretation Varies by Country, But More by Individual: A Statistical Analysis Using Crowdsourced Ratings , 2018, 2018 Fifth International Conference on Social Networks Analysis, Management and Security (SNAMS).
[82] Grant Blank. The Digital Divide Among Twitter Users and Its Implications for Social Research , 2017 .
[83] Rahul Goel,et al. Detecting Offensive Content in Open-domain Conversations using Two Stage Semi-supervision , 2018, ArXiv.
[84] Reut Tsarfaty,et al. Evaluating NLP Models via Contrast Sets , 2020, ArXiv.
[85] Lei Gao,et al. Detecting Online Hate Speech Using Context Aware Models , 2017, RANLP.
[86] John P. A. Ioannidis,et al. A manifesto for reproducible science , 2017, Nature Human Behaviour.
[87] Yarin Gal,et al. BatchBALD: Efficient and Diverse Batch Acquisition for Deep Bayesian Active Learning , 2019, NeurIPS.
[88] Indra Budi,et al. Multi-label Hate Speech and Abusive Language Detection in Indonesian Twitter , 2019, Proceedings of the Third Workshop on Abusive Language Online.
[89] P. Glasziou,et al. Bias in meta-analysis detected by a simple, graphical test. Graphical test is itself biased. , 1998, BMJ.
[90] Ika Alfina,et al. Hate speech detection in the Indonesian language: A dataset and preliminary study , 2017, 2017 International Conference on Advanced Computer Science and Information Systems (ICACSIS).
[91] Shanmughapriya,et al. JIGSAW MULTILINGUAL TOXIC COMMENT CLASSIFICATION , 2022 .
[92] M. Williams,et al. Towards an Ethical Framework for Publishing Twitter Data in Social Research: Taking into Account Users’ Views, Online Context and Algorithmic Estimation , 2017, Sociology.
[93] Christo Wilson,et al. Reasoning about Political Bias in Content Moderation , 2020, AAAI.
[94] Taha Yasseri,et al. A Biased Review of Biases in Twitter Studies on Political Collective Action , 2016, Front. Phys..
[95] Paolo Rosso,et al. Overview of the Task on Automatic Misogyny Identification at IberEval 2018 , 2018, IberEval@SEPLN.
[96] Matthew K. O. Lee,et al. Online social networks: Why do students use facebook? , 2011, Comput. Hum. Behav..
[97] Daniel Matthew Cer,et al. Language-agnostic BERT Sentence Embedding , 2020, ACL.
[98] Justin Grimmer,et al. Text as Data: The Promise and Pitfalls of Automatic Content Analysis Methods for Political Texts , 2013, Political Analysis.
[99] Imran Awan,et al. We fear for our lives: offline and online experiences of anti-Muslim hostility , 2015 .
[100] N. L. Vuong,et al. Quality of flow diagram in systematic review and/or meta-analysis , 2018, PloS one.
[101] Diana Maynard,et al. Who cares about Sarcastic Tweets? Investigating the Impact of Sarcasm on Sentiment Analysis. , 2014, LREC.
[102] Fiorenzo Franceschini,et al. Do Scopus and WoS correct “old” omitted citations? , 2016, Scientometrics.
[103] Manish Shrivastava,et al. Degree based Classification of Harmful Speech using Twitter Data , 2018, TRAC@COLING 2018.
[104] Ingmar Weber,et al. Understanding Abuse: A Typology of Abusive Language Detection Subtasks , 2017, ALW@ACL.
[105] Andrew Kehoe,et al. A corpus linguistic approach to the identification of swearing in computer mediated communication , 2017 .
[106] Nikola S. Nikolov,et al. Dataset Construction for the Detection of Anti-Social Behaviour in Online Communication in Arabic , 2018, ACLING.
[107] K. Bretonnel Cohen,et al. Last Words: Amazon Mechanical Turk: Gold Mine or Coal Mine? , 2011, CL.
[108] Adam Tauman Kalai,et al. Man is to Computer Programmer as Woman is to Homemaker? Debiasing Word Embeddings , 2016, NIPS.
[109] J. McGowan,et al. PRISMA Extension for Scoping Reviews (PRISMA-ScR): Checklist and Explanation , 2018, Annals of Internal Medicine.
[110] George Bravos,et al. Online Appendix to: Understanding Human-Machine Networks: A Cross-Disciplinary Survey , 2017 .
[111] Scott A. Hale,et al. Political Turbulence: How Social Media Shape Collective Action , 2015 .
[112] Preslav Nakov,et al. SemEval-2019 Task 6: Identifying and Categorizing Offensive Language in Social Media (OffensEval) , 2019, *SEMEVAL.
[113] Tomaž Erjavec,et al. Datasets of Slovene and Croatian Moderated News Comments , 2018, ALW.
[114] Jill P Mesirov,et al. Accessible Reproducible Research , 2010, Science.
[116] Stan Matwin,et al. Boosting Text Classification Performance on Sexist Tweets by Text Augmentation and Text Generation Using a Combination of Knowledge Graphs , 2018, ALW.
[117] Soon-Gyo Jung,et al. Topic-driven toxicity: Exploring the relationship between online toxicity and news topics , 2020, PloS one.
[118] Dirk Hovy,et al. Hateful Symbols or Hateful People? Predictive Features for Hate Speech Detection on Twitter , 2016, NAACL.
[119] Amit P. Sheth,et al. A Quality Type-aware Annotated Corpus and Lexicon for Harassment Research , 2018, WebSci.
[120] N. Strossen. HATE: Why We Should Resist it With Free Speech, Not Censorship , 2018 .
[121] Naganna Chetty,et al. Hate speech review in the context of online social networks , 2018 .
[122] Rogers Prates de Pelle,et al. Offensive Comments in the Brazilian Web: a dataset and baseline results , 2017 .
[123] A. Kenny. Freewill and Responsibility (Routledge Revivals) , 2011 .
[124] Joel R. Tetreault,et al. Abusive Language Detection in Online User Content , 2016, WWW.
[125] D. Moher,et al. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. , 2010, International journal of surgery.
[126] Ralf Peters,et al. Detecting Cyberbullying in Online Communities , 2016, ECIS.
[127] Lluis Gomez,et al. Exploring Hate Speech Detection in Multimodal Publications , 2019, 2020 IEEE Winter Conference on Applications of Computer Vision (WACV).
[128] John Pavlopoulos,et al. Deeper Attention to Abusive User Content Moderation , 2017, EMNLP.
[129] Xiaochang Peng,et al. Exploring Deep Multimodal Fusion of Text and Photo for Hate Speech Classification , 2019, Proceedings of the Third Workshop on Abusive Language Online.
[130] Casey Fiesler,et al. “Participant” Perceptions of Twitter Research Ethics , 2018 .
[131] Sérgio Nunes,et al. A Survey on Automatic Detection of Hate Speech in Text , 2018, ACM Comput. Surv..
[132] Franz J. Király,et al. Design choices for productive, secure, data-intensive research at scale in the cloud , 2019, ArXiv.
[133] Jonathan Mellon,et al. Twitter and Facebook are not representative of the general population: Political attitudes and demographics of British social media users , 2017 .
[134] Björn Ross,et al. Measuring the Reliability of Hate Speech Annotations: The Case of the European Refugee Crisis , 2016, ArXiv.
[135] Giovanni Vigna,et al. Peer to Peer Hate: Hate Speech Instigators and Their Targets , 2018, ICWSM.