Paper Information - Investigations in Computational Sarcasm

Investigations in Computational Sarcasm

Sarcasm is verbal irony that is intended to mock or ridicule. Existing sentiment analysis systems show degraded performance on sarcastic text. Hence, computational sarcasm has received attention from the sentiment analysis community. Computational sarcasm refers to computational techniques that deal with sarcastic text. This thesis presents our investigations in computational sarcasm based on the linguistic notion of incongruity. For example, the sentence 'I love being ignored' is sarcastic because the positive word 'love' is incongruous with the negative phrase 'being ignored'. These investigations are divided into three parts: understanding the phenomenon of sarcasm, sarcasm detection, and sarcasm generation. To understand the phenomenon of sarcasm, we first consider two of its components: implied negative sentiment and the presence of a target. To understand the role that implied negative sentiment plays in sarcasm understanding, we present an annotation study that evaluates the quality of a sarcasm-labeled dataset created by non-native annotators. Following this, to show why the target of sarcasm is important for understanding sarcasm, we first describe an annotation study that highlights the challenges in distinguishing between sarcasm and irony (since irony does not have a target while sarcasm does), and then present a computational approach that extracts the target of a sarcastic text. We then present our approaches for sarcasm detection. To detect sarcasm, we capture incongruity in two ways: 'intra-textual incongruity', where we look at incongruity within the text to be classified (i.e., the target text), and 'context incongruity', where we incorporate information outside the target text.
To detect incongruity within the target text, we present four approaches: (a) a classifier that captures sentiment incongruity using sentiment-based features (as in the case of 'I love being ignored'), (b) a classifier that captures semantic incongruity (as in the case of 'A woman needs a man like a fish needs bicycle') using word embedding-based features, (c) a topic model that captures sentiment incongruity using sentiment distributions in the text (in order to discover sarcasm-prevalent topics such as work, college, etc.), and (d) an approach that captures incongruity in the language model using sentence completion. The approaches in (a) and (c) capture sentiment incongruity by relying on sentiment-bearing words, whereas the approaches in (b) and (d) tackle other forms of incongruity where sentiment-bearing words may not be present. To detect sarcasm using context incongruity, we describe two approaches: (a) a rule-based approach that uses historical text by an author to detect sarcasm in text generated by that author, and (b) a statistical approach that uses sequence labeling techniques for sarcasm detection in dialogue. The approach in (a) attempts to detect sarcasm that requires author-specific context, while that in (b) attempts to detect sarcasm that requires conversation-specific context. Finally, we present a technique for sarcasm generation. In this case, we use a template-based approach to synthesize incongruity and generate a sarcastic response to user input. Our investigations demonstrate how evidence of incongruity (such as sentiment incongruity, semantic incongruity, etc.) can be modeled using different learning techniques (such as classifiers, topic models, etc.) for sarcasm detection and sarcasm generation. In addition, our findings establish the promise of novel problems like sarcasm target identification and sarcasm versus irony classification, and provide insights for future research in sarcasm detection.
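As a toy illustration of the sentiment incongruity idea described in the abstract, the following sketch flags a text when it contains both a positive and a negative word. It is not the classifier from the thesis: the word lists `POSITIVE` and `NEGATIVE` and the function name `has_sentiment_incongruity` are hypothetical, chosen only for this example, and a real system would likely use a full sentiment lexicon and learned features.

```python
# Hypothetical toy lexicons (assumptions for illustration only).
POSITIVE = {"love", "enjoy", "great", "adore"}
NEGATIVE = {"ignored", "late", "stuck", "broken"}


def has_sentiment_incongruity(text: str) -> bool:
    """Return True if the text contains both a positive and a negative word."""
    # Lowercase each whitespace-separated token and strip surrounding punctuation.
    tokens = [tok.strip(".,!?'\"`").lower() for tok in text.split()]
    has_positive = any(tok in POSITIVE for tok in tokens)
    has_negative = any(tok in NEGATIVE for tok in tokens)
    return has_positive and has_negative


if __name__ == "__main__":
    print(has_sentiment_incongruity("I love being ignored"))
    print(has_sentiment_incongruity("I love sunny days"))
```

Under these toy lexicons, 'I love being ignored' is flagged because 'love' and 'ignored' carry opposite polarity, while a sentence with only positive words is not. A co-occurrence check of this kind cannot handle cases without sentiment-bearing words, which is the gap the abstract attributes to the semantic and language-model approaches.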

Pushpak Bhattacharyya | Aditya Joshi | Mark J. Carman