How Effectively Can Machines Defend Against Machine-Generated Fake News? An Empirical Study

We empirically study the effectiveness of machine-generated fake news detectors by measuring their sensitivity to different synthetic perturbations applied at test time. Current machine-generated fake news detectors rely on provenance to determine the veracity of news. Our experiments show that the success of these detectors can be limited: they are largely insensitive to semantic perturbations yet highly sensitive to syntactic perturbations. We also open-source our code, which we believe can serve as a useful diagnostic tool for evaluating models aimed at fighting machine-generated fake news.
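
To make the evaluation protocol concrete, below is a minimal sketch of this kind of test-time perturbation probe. The detector `detect_fake_prob` and both perturbation functions are hypothetical placeholders introduced for illustration, not the paper's implementation; a real probe would plug in an actual detector (e.g. a Grover- or GPT-2-output-style classifier) and the paper's perturbation set.

```python
# Minimal sketch (assumptions labeled): probe a fake-news detector's
# sensitivity to syntactic vs. semantic perturbations at test time.
import random
from typing import Callable


def syntactic_perturbation(text: str, seed: int = 0) -> str:
    """Change surface form while keeping the meaning (toy version:
    swap two word positions in the sentence)."""
    rng = random.Random(seed)
    words = text.split()
    i, j = rng.sample(range(len(words)), 2)
    words[i], words[j] = words[j], words[i]
    return " ".join(words)


def semantic_perturbation(text: str) -> str:
    """Change the meaning while keeping the surface form similar
    (toy version: flip a single fact-bearing word)."""
    return text.replace("confirmed", "denied")


def sensitivity(detector: Callable[[str], float],
                original: str, perturbed: str) -> float:
    """Absolute change in the detector's score under a perturbation."""
    return abs(detector(original) - detector(perturbed))


if __name__ == "__main__":
    article = "Officials confirmed the vaccine reduces transmission."

    # Hypothetical detector stand-in: replace with a real model's
    # P(machine-generated / fake) to run an actual probe.
    def detect_fake_prob(text: str) -> float:
        return 0.5  # placeholder score

    print("syntactic sensitivity:",
          sensitivity(detect_fake_prob, article,
                      syntactic_perturbation(article)))
    print("semantic sensitivity:",
          sensitivity(detect_fake_prob, article,
                      semantic_perturbation(article)))
```

Under this probe, a provenance-style detector would register a large score change for the syntactic edit and little or no change for the semantic edit, which is the failure mode the study reports.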
