Taking a Stance on Fake News: Towards Automatic Disinformation Assessment via Deep Bidirectional Transformer Language Models for Stance Detection

The exponential rise of social media and digital news in the past decade has had the unfortunate consequence of escalating what the United Nations has called a global topic of concern: the growing prevalence of disinformation. Given the complexity and time-consuming nature of combating disinformation through human assessment, there is strong motivation to harness AI solutions that automatically assess news articles for the presence of disinformation. A valuable first step towards the automatic identification of disinformation is stance detection: given a claim and a news article, the aim is to predict whether the article agrees with, disagrees with, takes no position on, or is unrelated to the claim. Existing approaches in the literature have largely relied on hand-engineered features or shallow learned representations (e.g., word embeddings) to encode the claim-article pairs, which can limit the representational expressiveness needed to tackle the high complexity of disinformation identification. In this work, we explore harnessing large-scale deep bidirectional transformer language models to encode claim-article pairs, with the goal of constructing a state-of-the-art stance detection method geared towards identifying disinformation. Taking advantage of bidirectional cross-attention between the claim and the article, obtained by encoding each pair as a single sequence and applying self-attention, we construct a large-scale language model for stance detection by performing transfer learning on a RoBERTa deep bidirectional transformer language model, and achieve state-of-the-art performance (weighted accuracy of 90.01%) on the Fake News Challenge Stage 1 (FNC-I) benchmark. These promising results motivate the use of such large-scale language models as powerful building blocks for effective AI solutions to combat disinformation.
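
To make the described setup concrete, the sketch below fine-tunes a RoBERTa model for four-way claim-article stance classification (agree, disagree, discuss, unrelated) using the HuggingFace Transformers library. It is a minimal illustration only: the base checkpoint, hyperparameters, toy data, and label ordering are assumptions for demonstration, not the authors' exact configuration or the FNC-I training pipeline.

```python
# Minimal sketch: transfer learning on RoBERTa for claim-article stance
# detection. Hyperparameters, label names, and the example pair are
# illustrative assumptions, not the paper's exact setup.
import torch
from transformers import RobertaTokenizer, RobertaForSequenceClassification

LABELS = ["agree", "disagree", "discuss", "unrelated"]

tokenizer = RobertaTokenizer.from_pretrained("roberta-base")
model = RobertaForSequenceClassification.from_pretrained(
    "roberta-base", num_labels=len(LABELS)
)

# Toy claim-article pair; in practice these would come from the FNC-I data.
claims = ["Robert Plant turned down an $800 million Led Zeppelin reunion."]
articles = ["The singer reportedly rejected a lucrative offer to reunite the band."]
labels = torch.tensor([LABELS.index("agree")])

# Encode each claim-article pair as a single input sequence so that
# self-attention can attend across both the claim and the article tokens.
encodings = tokenizer(
    claims, articles, truncation=True, padding=True,
    max_length=512, return_tensors="pt",
)

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):  # illustrative number of fine-tuning epochs
    optimizer.zero_grad()
    outputs = model(**encodings, labels=labels)
    outputs.loss.backward()
    optimizer.step()

# Inference: predict the article's stance toward the claim.
model.eval()
with torch.no_grad():
    logits = model(**encodings).logits
print(LABELS[logits.argmax(dim=-1).item()])
```

In this sketch the claim and article are packed into one sequence separated by the tokenizer's special tokens, so the transformer's self-attention provides the bidirectional cross-attention between the two texts that the approach relies on; long articles would be truncated to the model's 512-token limit.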
