SemEval-2018 Task 3: Irony Detection in English Tweets

This paper presents the first shared task on irony detection: given a tweet, automatic natural language processing systems should determine whether the tweet is ironic (Task A) and which type of irony, if any, is expressed (Task B). The ironic tweets were collected using irony-related hashtags (i.e. #irony, #sarcasm, #not) and were subsequently manually annotated to minimise the amount of noise in the corpus. Prior to distributing the data, the hashtags that were used to collect the tweets were removed from the corpus. For both tasks, a training corpus of 3,834 tweets was provided, as well as a test set containing 784 tweets. The shared task received submissions from 43 teams for the binary classification Task A and from 31 teams for the multiclass Task B. The highest classification scores obtained for the two subtasks are F1 = 0.71 and F1 = 0.51, respectively, demonstrating that fine-grained irony classification is much more challenging than binary irony detection.
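The hashtag-removal step described above can be illustrated with a minimal sketch. This is not the organisers' actual preprocessing script: the function name `strip_collection_hashtags`, the case-insensitive matching, and the whitespace clean-up are all assumptions made for illustration. Only the three hashtags themselves come from the text.

```python
import re

# Hashtags named in the abstract as the cues used to collect the tweets.
COLLECTION_HASHTAGS = ("irony", "sarcasm", "not")

# Match '#irony', '#sarcasm' and '#not' as whole hashtags only, so that
# longer tags such as '#nothing' are left alone. Case-insensitive matching
# is an assumption, not something the abstract states.
_PATTERN = re.compile(
    r"#(?:%s)\b" % "|".join(COLLECTION_HASHTAGS), re.IGNORECASE
)


def strip_collection_hashtags(tweet: str) -> str:
    """Remove the collection hashtags and collapse leftover whitespace.

    Hypothetical helper for illustration only.
    """
    without_tags = _PATTERN.sub("", tweet)
    return " ".join(without_tags.split())
```

For example, `strip_collection_hashtags("Monday mornings are my fave :) #not")` returns `"Monday mornings are my fave :)"`, which shows why removing the cue matters: without the hashtag, a system has to infer the irony from the remaining text alone.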
