Will-They-Won’t-They: A Very Large Dataset for Stance Detection on Twitter

We present a new, challenging stance detection dataset, called Will-They-Won't-They (WT-WT), which contains 51,284 tweets in English, making it by far the largest available dataset of its type. All annotations were carried out by experts; the dataset therefore constitutes a high-quality and reliable benchmark for future research in stance detection. Our experiments with a wide range of recent state-of-the-art stance detection systems show that the dataset poses a strong challenge to existing models in this domain.
