Defending Against Disinformation Attacks in Open-Domain Question Answering

Recent work in open-domain question answering (ODQA) has shown that adversarial poisoning of the search collection can cause large drops in accuracy for production systems. However, little to no work has proposed methods to defend against these attacks. To address this gap, we rely on the intuition that redundant information often exists in large corpora. To find it, we introduce a method that uses query augmentation to search for a diverse set of passages that could answer the original question but are less likely to have been poisoned. We integrate these new passages into the model through the design of a novel confidence method, comparing the predicted answer to its appearance in the retrieved contexts (what we call Confidence from Answer Redundancy, i.e., CAR). Together, these methods provide a simple but effective defense against poisoning attacks, yielding gains of nearly 20% exact match across varying levels of data poisoning and knowledge conflicts.
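The abstract describes two components: query augmentation to retrieve a more diverse passage set, and a redundancy-based confidence score over the predicted answer. Below is a minimal sketch of how such a pipeline might look, assuming a generic retriever, reader, and paraphrasing function; the function names, parameters, and thresholding logic are illustrative assumptions, not the paper's actual implementation.

```python
from typing import Callable, List, Optional


def augment_queries(question: str,
                    paraphrase: Callable[[str], List[str]],
                    n: int = 4) -> List[str]:
    """Generate query variants so retrieval is less likely to surface only
    the same (possibly poisoned) passages as the original question.
    `paraphrase` is any question-rewriting function (e.g., an LLM prompt)."""
    return [question] + paraphrase(question)[:n]


def confidence_from_answer_redundancy(answer: str,
                                      passages: List[str]) -> float:
    """Score the predicted answer by how often it appears in the retrieved
    contexts: an answer supported by many independently retrieved passages
    is less likely to originate from a single poisoned document."""
    if not passages:
        return 0.0
    hits = sum(1 for p in passages if answer.lower() in p.lower())
    return hits / len(passages)


def answer_with_defense(question: str,
                        retrieve: Callable[[str], List[str]],
                        read: Callable[[str, List[str]], str],
                        paraphrase: Callable[[str], List[str]],
                        threshold: float = 0.5) -> Optional[str]:
    """End-to-end sketch: retrieve with augmented queries, read an answer,
    then abstain when redundancy support falls below a threshold."""
    queries = augment_queries(question, paraphrase)
    passages = [p for q in queries for p in retrieve(q)]
    answer = read(question, passages)
    car = confidence_from_answer_redundancy(answer, passages)
    return answer if car >= threshold else None  # None signals abstention
```

The key design idea this sketch tries to capture is that poisoned passages tend to cluster around the surface form of the original question, so paraphrased queries pull in cleaner evidence, and an answer that recurs across that broader evidence pool earns higher confidence.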
