Exploiting stance hierarchies for cost-sensitive stance detection of Web documents

Fact checking is an essential task in combating fake news. Identifying documents that agree or disagree with a particular statement (claim) is a core step in this process. In this context, stance detection aims to identify the position (stance) of a document towards a claim. Most approaches address this task with a 4-class classification model over a highly imbalanced class distribution. As a result, they are particularly ineffective at detecting the minority classes (for instance, 'disagree'), even though such instances are crucial for fact checking because they provide evidence for detecting false claims. In this paper, we exploit the hierarchical nature of stance classes to propose a modular pipeline of cascading binary classifiers, which enables performance tuning on a per-step and per-class basis. We implement our approach with a combination of neural and traditional classification models that emphasize the misclassification costs of the minority classes. Evaluation results demonstrate the state-of-the-art performance of our approach and its ability to significantly improve the classification performance of the important 'disagree' class.
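
To make the cascading design concrete, the sketch below shows one possible realization of such a pipeline; it is not the paper's implementation. The three-stage split (unrelated vs. related, discuss vs. stance-taking, agree vs. disagree), the TF-IDF features, the scikit-learn classifiers, and the cost weight of 5.0 on the 'disagree' class are all illustrative assumptions.

```python
# Minimal sketch (not the authors' implementation) of a cascading
# binary-classifier pipeline for 4-class stance detection, with a
# higher misclassification cost on the minority 'disagree' class.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline


def make_stage(pos_weight=1.0):
    """Binary stage; pos_weight > 1 penalizes errors on class 1 more heavily."""
    return make_pipeline(
        TfidfVectorizer(),
        LogisticRegression(max_iter=1000, class_weight={0: 1.0, 1: pos_weight}),
    )


class CascadeStanceClassifier:
    """Chains three binary stages into a 4-class stance decision."""

    def __init__(self):
        self.related = make_stage()                  # unrelated (0) vs. related (1)
        self.stance = make_stage()                   # discuss (0) vs. agree/disagree (1)
        self.polarity = make_stage(pos_weight=5.0)   # agree (0) vs. disagree (1), cost-weighted

    def fit(self, pairs, labels):
        # pairs: "claim [SEP] document" strings; labels: one of the four stance classes.
        self.related.fit(pairs, [0 if y == "unrelated" else 1 for y in labels])

        rel_pairs = [p for p, y in zip(pairs, labels) if y != "unrelated"]
        rel_labels = [y for y in labels if y != "unrelated"]
        self.stance.fit(rel_pairs, [0 if y == "discuss" else 1 for y in rel_labels])

        pol_pairs = [p for p, y in zip(rel_pairs, rel_labels) if y in ("agree", "disagree")]
        pol_labels = [y for y in rel_labels if y in ("agree", "disagree")]
        self.polarity.fit(pol_pairs, [1 if y == "disagree" else 0 for y in pol_labels])
        return self

    def predict_one(self, pair):
        if self.related.predict([pair])[0] == 0:
            return "unrelated"
        if self.stance.predict([pair])[0] == 0:
            return "discuss"
        return "disagree" if self.polarity.predict([pair])[0] == 1 else "agree"
```

Replacing any stage with a neural model, or tuning its class weights and decision threshold independently of the other stages, is the kind of per-step, per-class adjustment the cascade is intended to enable.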
