Cross-Domain Label-Adaptive Stance Detection

Stance detection concerns the classification of a writer’s viewpoint towards a target. There are different task variants, e.g., stance of a tweet vs. a full article, or stance with respect to a claim vs. an (implicit) topic. Moreover, task definitions vary, which includes the label inventory, the data collection, and the annotation protocol. All these aspects hinder cross-domain studies, as they require changes to standard domain adaptation approaches. In this paper, we perform an in-depth analysis of 16 stance detection datasets, and we explore the possibility for cross-domain learning from them. Moreover, we propose an end-to-end unsupervised framework for out-of-domain prediction of unseen, user-defined labels. In particular, we combine domain adaptation techniques such as mixture of experts and domain-adversarial training with label embeddings, and we demonstrate sizable performance gains over strong baselines, both (i) in-domain, i.e., for seen targets, and (ii) out-of-domain, i.e., for unseen targets. Finally, we perform an exhaustive analysis of the cross-domain results, and we highlight the important factors influencing the model performance.
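
To make the idea of predicting over unseen, user-defined labels more concrete, the sketch below shows one generic way label embeddings can be used: a model output vector is compared against an embedding of each label name, and the closest label wins. This is an illustration under stated assumptions, not the framework described in the abstract. The function names `cosine` and `predict_label`, the toy three-dimensional vectors, and the label names are all hypothetical; in practice the vectors would come from a trained encoder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def predict_label(output_vec, label_embeddings):
    """Return the label whose embedding is most similar to the model output.

    label_embeddings maps a label name to its vector. Because the label set
    is just a dictionary, it can be swapped at prediction time for labels
    that were never seen during training.
    """
    return max(
        label_embeddings,
        key=lambda name: cosine(output_vec, label_embeddings[name]),
    )


# Hypothetical toy embeddings; real ones would be produced by an encoder.
SEEN_LABELS = {
    "support": [1.0, 0.0, 0.0],
    "deny": [-1.0, 0.0, 0.0],
    "discuss": [0.0, 1.0, 0.0],
}

# A different, user-defined label inventory placed in the same vector space.
USER_LABELS = {
    "favor": [1.0, 0.0, 0.0],
    "against": [-1.0, 0.0, 0.0],
}

if __name__ == "__main__":
    print(predict_label([0.9, 0.1, 0.0], SEEN_LABELS))
    print(predict_label([-0.2, 0.1, 0.0], USER_LABELS))
```

The design point is that the classifier head is replaced by a similarity lookup, so changing the label inventory requires new label vectors rather than a retrained output layer.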
