Paper information - A novel approach to stance detection in social media tweets by fusing ranked lists and sentiments

Abstract: Stance detection is a relatively new concept in data mining that aims to assign a stance label (favor, against, or none) to a social media post with respect to a specific predetermined target. The target may not be mentioned in the post, and it may not be the target of the opinion the post expresses. In this paper, we propose a novel enhanced method for identifying the writer’s stance in a given tweet. The method is a three-phase process: (a) tweet preprocessing, in which we clean and normalize tweets (e.g., remove stop words) to generate lists of words and stems; (b) feature generation, in which we create and fuse two dictionaries to generate the feature vector; and (c) classification, in which all feature instances are classified based on the list of targets. Our feature selection fuses two ranked (top-k) lists: term frequency-inverse document frequency (tf-idf) scores and sentiment information. We evaluate our method using six different classifiers: K-nearest neighbor (K-NN), discernibility-based K-NN, weighted K-NN, class-based K-NN, exemplar-based K-NN, and Support Vector Machines. Furthermore, we investigate the use of Principal Component Analysis and study its effect on performance. The model is evaluated on the benchmark dataset (SemEval-2016 Task 6), and the significance of the results is determined using a t-test. Our best performance is a macro F-score (averaged across all topics) of 76.45%, achieved with the weighted K-NN classifier. This exceeds the current state-of-the-art score of 74.44% on the same dataset.
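The feature-generation and classification phases described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not state the fusion rule, so the sketch assumes an ordered union of the top-k tf-idf terms and a sentiment word list. It also assumes inverse-distance voting as the weighted K-NN scheme, which is one common choice. All function names and the toy tweets are hypothetical.

```python
import math
from collections import Counter


def top_k_tfidf(docs, k):
    """Rank terms by tf-idf summed over all tokenized docs; return the top k."""
    n = len(docs)
    # Document frequency: number of docs that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = Counter()
    for doc in docs:
        for term, count in Counter(doc).items():
            scores[term] += (count / len(doc)) * math.log(n / df[term])
    # Sort by score (descending), then alphabetically.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def fuse(ranked_terms, sentiment_words):
    """Assumed fusion rule: ranked terms first, then unseen sentiment words."""
    fused = list(ranked_terms)
    fused += [w for w in sorted(sentiment_words) if w not in ranked_terms]
    return fused


def vectorize(doc, vocabulary):
    """Count how often each vocabulary term occurs in a tokenized doc."""
    counts = Counter(doc)
    return [counts[term] for term in vocabulary]


def weighted_knn(train_vecs, train_labels, query, k=3):
    """Assumed weighting: each of the k nearest neighbors votes with 1/distance."""
    neighbors = sorted(
        (math.dist(vec, query), label)
        for vec, label in zip(train_vecs, train_labels)
    )
    votes = Counter()
    for distance, label in neighbors[:k]:
        votes[label] += 1.0 / (distance + 1e-9)  # small constant avoids division by zero
    return votes.most_common(1)[0][0]


# Toy data, invented for illustration only (already tokenized and cleaned).
tweets = [
    ["climate", "action", "great"],
    ["climate", "policy", "good"],
    ["climate", "hoax", "bad"],
    ["hoax", "terrible", "lie"],
]
labels = ["favor", "favor", "against", "against"]
sentiment_words = {"great", "good", "bad", "terrible"}  # hypothetical lexicon

vocabulary = fuse(top_k_tfidf(tweets, k=3), sentiment_words)
train = [vectorize(t, vocabulary) for t in tweets]
print(weighted_knn(train, labels, vectorize(["hoax", "bad"], vocabulary)))
```

With inverse-distance voting, a single very close neighbor can outweigh several distant ones, which is the practical difference from plain majority-vote K-NN.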

Amir Hussain | Aqil M. Azmi | Abdulrahman I. Al-Ghadir
