TrollsWithOpinion: A Dataset for Predicting Domain-specific Opinion Manipulation in Troll Memes

Research into the classification of Image with Text (IWT) troll memes has recently become popular. Because online communities use memes as a refuge for self-expression, meme data is abundant. These memes have the potential to demean, harass, or bully targeted individuals. Moreover, the targeted individual could fall prey to opinion manipulation. To understand how memes are used in opinion manipulation, we define three domains (product, political, or other) and classify the memes in each as troll or not-troll, with or without opinion manipulation. To enable this analysis, we enhanced an existing dataset by annotating it with our defined classes, resulting in a dataset of 8,881 IWT (multimodal) memes in the English language, the TrollsWithOpinion dataset. We perform baseline experiments on the annotated dataset, and our results show that existing state-of-the-art techniques reach a weighted-average F1-score of only 0.37. This shows the need to develop techniques designed specifically for multimodal troll memes.
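For readers unfamiliar with the reported metric, the following is a minimal sketch of how a weighted-average F1-score is commonly computed: the per-class F1 scores are averaged, each weighted by the share of true examples in that class. The function name `weighted_f1` and the toy labels are illustrative assumptions, not the evaluation code or class names used for the dataset.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighted by each class's support."""
    support = Counter(y_true)  # number of true examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp  # true examples of this class that were missed
        # F1 = 2*TP / (2*TP + FP + FN); the denominator is > 0 since n > 0
        f1 = 2 * tp / (2 * tp + fp + fn)
        score += (n / total) * f1
    return score


# Hypothetical toy labels, for illustration only
y_true = ["troll", "troll", "not_troll", "not_troll"]
y_pred = ["troll", "not_troll", "not_troll", "not_troll"]
print(round(weighted_f1(y_true, y_pred), 4))
```

Because the weights follow class frequency, a low weighted score such as 0.37 indicates weak performance on the classes that make up most of the data, not only on rare ones.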
