Chinese Irony Corpus Construction and Ironic Structure Analysis

Recognizing non-literal expressions is a challenging task in natural language processing. An ironic expression implies the opposite of its literal meaning, which causes problems for opinion mining and sentiment analysis. In this paper, we collect ironic messages from microblogs and build an irony corpus based on the use of emoticons, linguistic forms, and sentiment polarity. Five linguistic patterns are mined with the proposed bootstrapping approach. We also analyze the linguistic structure and the elements used to convey irony. Our observations indicate that ironic words or phrases and contextual information are the necessary elements of irony, although the contextual information may be hidden in linguistic forms. A rhetorical element, which is optional, can strengthen the effect of an ironic expression and make it easier to understand. The ironic elements in each instance of our irony corpus are labelled according to this structure. The corpus can be used to study the usage of ironic expressions and the identification of ironic elements, and thus to improve the performance of irony recognition.
