Semi-Supervised Recognition of Sarcasm in Twitter and Amazon

Sarcasm is a form of speech act in which speakers convey their message implicitly. The inherently ambiguous nature of sarcasm sometimes makes it hard even for humans to decide whether an utterance is sarcastic. Recognition of sarcasm can benefit many sentiment analysis applications in NLP, such as review summarization, dialogue systems and review ranking systems. In this paper we experiment with semi-supervised sarcasm identification on two very different datasets: a collection of 5.9 million tweets from Twitter and a collection of 66,000 product reviews from Amazon. Using Amazon Mechanical Turk, we created a gold standard sample in which each sentence was tagged by 3 annotators. Evaluated against this sample, the algorithm obtains F-scores of 0.78 on the product reviews dataset and 0.83 on the Twitter dataset. We discuss the differences between the datasets and how the algorithm uses them (e.g., for the Amazon dataset the algorithm makes use of structured information). We also discuss the utility of Twitter #sarcasm hashtags for the task.
