A novel approach to stance detection in social media tweets by fusing ranked lists and sentiments

Abstract: Stance detection is a relatively new task in data mining that aims to assign a stance label (favor, against, or none) to a social media post with respect to a specific, pre-determined target. The target may not be mentioned in the post, and it may not be the target of the opinion the post expresses. In this paper, we propose a novel, enhanced method for identifying the writer’s stance in a given tweet. The method is a three-phase process: (a) tweet preprocessing, in which we clean and normalize the tweets (e.g., remove stop-words) to generate lists of words and stems; (b) feature generation, in which we create and fuse two dictionaries to generate the feature vector; and (c) classification, in which all feature instances are classified with respect to the list of targets. Our innovative feature selection fuses two ranked (top-k) lists, one of term frequency-inverse document frequency (tf-idf) scores and one of sentiment information. We evaluate the method using six classifiers: K-nearest neighbor (K-NN), discernibility-based K-NN, weighted K-NN, class-based K-NN, exemplar-based K-NN, and Support Vector Machines. Furthermore, we investigate the use of Principal Component Analysis and study its effect on performance. The model is evaluated on the benchmark dataset (SemEval-2016 Task 6), and the significance of the results is determined using a t-test. Our best performance is a macro F-score (averaged across all topics) of 76.45%, achieved with the weighted K-NN classifier. This exceeds the current state-of-the-art score of 74.44% on the same dataset.
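The feature-generation and classification phases described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the fusion rule, so an ordered union of the two top-k lists is assumed here; the sentiment lexicon `SENTIMENT`, the toy tweets, and all function names are hypothetical; and inverse-distance voting stands in for whichever weighting scheme the weighted K-NN classifier actually uses.

```python
import math
from collections import Counter

# Hypothetical toy sentiment lexicon (term -> polarity); not taken from the paper.
SENTIMENT = {"love": 1.0, "great": 0.8, "hate": -1.0, "hoax": -0.7}

def tfidf_scores(docs):
    """Return each term's maximum tf-idf score over a list of tokenized docs."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    best = {}
    for doc in docs:
        for term, count in Counter(doc).items():
            score = (count / len(doc)) * math.log(n / df[term])
            best[term] = max(best.get(term, 0.0), score)
    return best

def top_k(scores, k):
    """Rank terms by descending score (ties broken alphabetically), keep k."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]

def fuse(tfidf_rank, sentiment_rank):
    """Assumed fusion rule: ordered union of the two ranked lists."""
    return list(tfidf_rank) + [t for t in sentiment_rank if t not in tfidf_rank]

def vectorize(doc, vocab):
    """Count-based feature vector over the fused vocabulary."""
    counts = Counter(doc)
    return [float(counts[t]) for t in vocab]

def weighted_knn(train, query, k):
    """Inverse-distance-weighted vote among the k nearest training vectors."""
    nearest = sorted((math.dist(vec, query), label) for vec, label in train)[:k]
    votes = Counter()
    for dist, label in nearest:
        votes[label] += 1.0 / (dist + 1e-9)  # small constant avoids division by zero
    return votes.most_common(1)[0][0]

# Toy, already-preprocessed tweets with stance labels (illustrative only).
docs = [
    (["climate", "change", "is", "a", "hoax"], "against"),
    (["i", "hate", "climate", "alarmism"], "against"),
    (["love", "the", "climate", "action", "plan"], "favor"),
    (["great", "news", "for", "climate", "action"], "favor"),
]
tokens = [doc for doc, _ in docs]

tfidf_rank = top_k(tfidf_scores(tokens), 5)
# Rank sentiment terms that occur in the corpus by absolute polarity.
present = {t: abs(s) for t, s in SENTIMENT.items() if any(t in d for d in tokens)}
sentiment_rank = top_k(present, 5)

vocab = fuse(tfidf_rank, sentiment_rank)
train = [(vectorize(doc, vocab), label) for doc, label in docs]
prediction = weighted_knn(train, vectorize(["i", "love", "climate", "action"], vocab), 3)
print(vocab, prediction)
```

Fusing the lists lets sentiment-bearing terms such as "love" enter the feature vocabulary even when their tf-idf rank alone would exclude them.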
