Sarcasm Detection in Politically Motivated Social Media Content

During the coronavirus pandemic, sarcasm was often used on social media platforms to insult, taunt, mock, and deride alternative points of view. Because sarcasm used in this way can promote hatred, incite violence, and encourage people to abandon safety measures, it is essential to separate such sarcastic content from the volume of social media feeds in order to understand what is actually being said. Prevalent sarcasm detection approaches that rely on hashtags such as #sarcasm or #irony for weak learning may not apply well, because most people either do not use these hashtags in chaotic and passionate situations or, worse, sometimes use them erroneously to tag non-sarcastic content. This paper proposes a comprehensive coding guide for labeling sarcastic social media content by integrating definitions from popular English-language dictionaries. It applies this guide to label tweets collected following two anti-lockdown protests in Michigan. A suite of features capturing contextual, linguistic, social, sentiment, and auxiliary aspects is extracted from the labeled tweets. These features are used to train many common machine learning models, which can separate sarcastic from non-sarcastic tweets with an F1-score of 0.83. The approach is promising because it offers competitive accuracy in detecting sarcasm in data combined from two different contexts, without any guiding hashtags. Importance scores indicate that non-contextual features contribute about 57% to the detection, suggesting that sarcastic tendencies and their expressions may be innate to individuals and may transcend the context, and hence offer clues for building portable, context-agnostic classifiers for automated detection.
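As a rough illustration of the two quantities reported above, the sketch below computes an F1-score from binary labels and the share of total feature importance attributable to each feature group. It is a minimal sketch under stated assumptions, not the paper's implementation: the function names, the feature names, the group assignments, and all numbers are hypothetical toy values chosen only to make the arithmetic visible.

```python
from collections import defaultdict


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive (sarcastic) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are zero or undefined
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def group_shares(importances, feature_groups):
    """Sum per-feature importance scores by group and normalize to fractions."""
    totals = defaultdict(float)
    for name, score in importances.items():
        totals[feature_groups[name]] += score
    grand_total = sum(totals.values())
    return {group: total / grand_total for group, total in totals.items()}


# Hypothetical toy labels: 1 = sarcastic, 0 = non-sarcastic.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 0, 1, 1, 0]

# Hypothetical importance scores and group assignments (not from the paper).
importances = {
    "reply_depth": 0.2,
    "punctuation_count": 0.3,
    "follower_count": 0.1,
    "polarity": 0.4,
}
feature_groups = {
    "reply_depth": "contextual",
    "punctuation_count": "linguistic",
    "follower_count": "social",
    "polarity": "sentiment",
}

score = f1_score(y_true, y_pred)
shares = group_shares(importances, feature_groups)
non_contextual_share = 1.0 - shares["contextual"]
print(f"F1 = {score:.2f}, non-contextual share = {non_contextual_share:.0%}")
```

In this reading, the non-contextual contribution is simply the summed share of every group other than the contextual one; how the paper derives its importance scores is not specified in the abstract.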