Leveraging Multi-Source Weak Social Supervision for Early Detection of Fake News

Social media has greatly enabled people to participate in online activities at an unprecedented rate. However, this unrestricted access also exacerbates the spread of misinformation and fake news online which might cause confusion and chaos unless being detected early for its mitigation. Given the rapidly evolving nature of news events and the limited amount of annotated data, state-of-the-art systems on fake news detection face challenges due to the lack of large numbers of annotated training instances that are hard to come by for early detection. In this work, we exploit multiple weak signals from different sources given by user and content engagements (referred to as weak social supervision), and their complementary utilities to detect fake news. We jointly leverage the limited amount of clean data along with weak signals from social engagements to train deep neural networks in a meta-learning framework to estimate the quality of different weak instances. Experiments on realworld datasets demonstrate that the proposed framework outperforms state-of-the-art baselines for early detection of fake news without using any user engagements at prediction time.

[1]  Xiaogang Wang,et al.  Multi-source Deep Learning for Human Pose Estimation , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.

[2]  Ryan L. Boyd,et al.  The Development and Psychometric Properties of LIWC2015 , 2015 .

[3]  Daniel F. Stone,et al.  Media Bias in the Marketplace: Theory , 2014 .

[4]  Barbara Poblete,et al.  Information credibility on twitter , 2011, WWW.

[5]  Liang Ge,et al.  Multi-source deep learning for information trustworthiness estimation , 2013, KDD.

[6]  Sinan Aral,et al.  The spread of true and false news online , 2018, Science.

[7]  Fenglong Ma,et al.  EANN: Event Adversarial Neural Networks for Multi-Modal Fake News Detection , 2018, KDD.

[8]  Stefano Ermon,et al.  Label-Free Supervision of Neural Networks with Physics and Domain Knowledge , 2016, AAAI.

[9]  Suhang Wang,et al.  SAME: Sentiment-Aware Multi-Modal Embedding for Detecting Fake News , 2019, 2019 IEEE/ACM International Conference on Advances in Social Networks Analysis and Mining (ASONAM).

[10]  Ahmed Hassan Awadallah,et al.  Meta Label Correction for Learning with Weak Supervision , 2019, ArXiv.

[11]  Yongdong Zhang,et al.  News Verification by Exploiting Conflicting Social Viewpoints in Microblogs , 2016, AAAI.

[12]  Richard Nock,et al.  Making Deep Neural Networks Robust to Label Noise: A Loss Correction Approach , 2016, 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR).

[13]  Frederic Sala,et al.  Training Complex Models with Multi-Task Weak Supervision , 2018, AAAI.

[14]  Omer Levy,et al.  RoBERTa: A Robustly Optimized BERT Pretraining Approach , 2019, ArXiv.

[15]  Kai Shu Beyond News Contents: The Role of Social Context for Fake News Detection , 2018 .

[16]  Huan Liu,et al.  FakeNewsNet: A Data Repository with News Content, Social Context and Dynamic Information for Studying Fake News on Social Media , 2018, ArXiv.

[17]  Yan Liu,et al.  Neural User Response Generator: Fake News Detection with Collective User Intelligence , 2018, IJCAI.

[18]  Suhang Wang,et al.  Fake News Detection on Social Media: A Data Mining Perspective , 2017, SKDD.

[19]  Mohammad Ali Abbasi,et al.  Measuring User Credibility in Social Media , 2013, SBP.

[20]  Nagarajan Natarajan,et al.  Learning with Noisy Labels , 2013, NIPS.

[21]  Wei Gao,et al.  Detect Rumors Using Time Series of Social Context Information on Microblogging Websites , 2015, CIKM.

[22]  Krishna P. Gummadi,et al.  Quantifying Search Bias: Investigating Sources of Bias for Political Searches in Social Media , 2017, CSCW.

[23]  Bin Yang,et al.  Learning to Reweight Examples for Robust Deep Learning , 2018, ICML.

[24]  Joan Bruna,et al.  Training Convolutional Networks with Noisy Labels , 2014, ICLR 2014.

[25]  Christopher Ré,et al.  Snorkel: Rapid Training Data Creation with Weak Supervision , 2017, Proc. VLDB Endow..

[26]  Kevin Gimpel,et al.  Using Trusted Data to Train Deep Networks on Labels Corrupted by Severe Noise , 2018, NeurIPS.

[27]  Victoria L. Rubin,et al.  Truth and deception at the rhetorical structure level , 2015, J. Assoc. Inf. Sci. Technol..

[28]  Zhi-Hua Zhou,et al.  Learning From Incomplete and Inaccurate Supervision , 2019, IEEE Transactions on Knowledge and Data Engineering.

[29]  Dumitru Erhan,et al.  Training Deep Neural Networks on Noisy Labels with Bootstrapping , 2014, ICLR.

[30]  Frederic Sala,et al.  Learning Dependency Structures for Weak Supervision Models , 2019, ICML.

[31]  Benno Stein,et al.  A Stylometric Inquiry into Hyperpartisan and Fake News , 2017, ACL.

[32]  Eric Gilbert,et al.  VADER: A Parsimonious Rule-Based Model for Sentiment Analysis of Social Media Text , 2014, ICWSM.

[33]  Yoon Kim,et al.  Convolutional Neural Networks for Sentence Classification , 2014, EMNLP.

[34]  Yale Song,et al.  Learning from Noisy Labels with Distillation , 2017, 2017 IEEE International Conference on Computer Vision (ICCV).