MultiAzterTest@VaxxStance-IberLEF 2021: Identifying Stances with Language Models and Linguistic Features

Stance detection is a Natural Language Processing task that has focused mainly on analysing debates and controversial topics. The VaxxStance@IberLEF 2021 shared task focuses on the anti-vaxxer movement in Basque and Spanish tweets. In this paper, we present the participation of the MultiAzterTest team, which tested two approaches: one based on language models and one based on linguistic and stylistic features. We also introduce the “one stance per tuiter@lari” heuristic to integrate contextual information. The language models obtain the best results, but the linguistic and stylistic feature-based approach offers more interpretability.
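The abstract does not say how the “one stance per tuiter@lari” heuristic is computed. As a minimal sketch, assuming it amounts to a majority vote over each user's tweet-level predictions, it could look like the following. The function name `one_stance_per_user`, the `(user_id, label)` input format, and the label strings are hypothetical and chosen only for illustration.

```python
from collections import Counter, defaultdict


def one_stance_per_user(predictions):
    """Relabel every tweet with its author's most frequent predicted stance.

    `predictions` is a list of (user_id, label) pairs, one per tweet.
    Assumption: majority voting; the paper may combine predictions differently.
    """
    # Count how often each label was predicted for each user.
    counts_by_user = defaultdict(Counter)
    for user_id, label in predictions:
        counts_by_user[user_id][label] += 1

    # Pick the most frequent label per user.
    majority = {
        user_id: counts.most_common(1)[0][0]
        for user_id, counts in counts_by_user.items()
    }

    # Return the tweets in their original order with the user-level label.
    return [(user_id, majority[user_id]) for user_id, _ in predictions]


if __name__ == "__main__":
    tweet_predictions = [
        ("u1", "AGAINST"),
        ("u1", "FAVOR"),
        ("u1", "AGAINST"),
        ("u2", "NONE"),
    ]
    for user_id, label in one_stance_per_user(tweet_predictions):
        print(user_id, label)
```

Under this assumption, an isolated prediction that disagrees with the rest of a user's tweets is overwritten, which is one way the author's other tweets can serve as context for each prediction.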
