Irony Detection in a Multilingual Context

This paper proposes the first multilingual (French, English and Arabic) and multicultural (Indo-European languages vs. less culturally close languages) irony detection system. We employ both feature-based models and neural architectures using monolingual word representations. We compare the performance of these systems with state-of-the-art systems to assess their capabilities. We show that these monolingual models, trained separately on different languages using multilingual word representations or text-based features, can open the door to irony detection in languages that lack annotated data for irony.
