Detecting Pretraining Data from Large Language Models
Danqi Chen, Yangsibo Huang, Terra Blevins, Luke Zettlemoyer, Anirudh Ajith, Mengzhou Xia, Weijia Shi, Daogao Liu
[1] Ronen Eldan et al. Who's Harry Potter? Approximate Unlearning in LLMs, 2023, arXiv.
[2] M. Surdeanu et al. Time Travel in LLMs: Tracing Data Contamination in Large Language Models, 2023, ICLR.
[3] Noah A. Smith et al. SILO Language Models: Isolating Legal Risk in a Nonparametric Datastore, 2023, arXiv.
[4] Taylor Berg-Kirkpatrick et al. Membership Inference Attacks against Language Models via Neighbourhood Comparison, 2023, ACL.
[5] T. Steinke et al. Privacy Auditing with One (1) Training Run, 2023, arXiv.
[6] David Bamman et al. Speak, Memory: An Archaeology of Books Known to ChatGPT/GPT-4, 2023, EMNLP.
[7] Ryan M. Rogers et al. Challenges towards the Next Frontier in Privacy, 2023, arXiv.
[8] Naman Goyal et al. LLaMA: Open and Efficient Foundation Language Models, 2023, arXiv.
[9] Florian Tramèr et al. Tight Auditing of Differentially Private Machine Learning, 2023, USENIX Security Symposium.
[10] Christopher D. Manning et al. DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature, 2023, ICML.
[11] Yangsibo Huang et al. A Dataset Auditing Method for Collaboratively Trained Machine Learning Models, 2022, IEEE Transactions on Medical Imaging.
[12] Xinchao Wang et al. Learning with Recoverable Forgetting, 2022, ECCV.
[13] Danqi Chen et al. Recovering Private Text in Federated Learning of Language Models, 2022, NeurIPS.
[14] Xi Victoria Lin et al. OPT: Open Pre-trained Transformer Language Models, 2022, arXiv.
[15] Stella Rose Biderman et al. GPT-NeoX-20B: An Open-Source Autoregressive Language Model, 2022, BigScience Workshop.
[16] Andrew M. Dai et al. PaLM: Scaling Language Modeling with Pathways, 2022, J. Mach. Learn. Res.
[17] Roy Schwartz et al. Data Contamination: From Memorization to Exploitation, 2022, ACL.
[18] R. Shokri et al. Quantifying Privacy Risks of Masked Language Models Using Membership Inference Attacks, 2022, EMNLP.
[19] Colin Raffel et al. Deduplicating Training Data Mitigates Privacy Risks in Language Models, 2022, ICML.
[20] Florian Tramèr et al. Counterfactual Memorization in Neural Language Models, 2021, NeurIPS.
[21] Quoc V. Le et al. GLaM: Efficient Scaling of Language Models with Mixture-of-Experts, 2021, ICML.
[22] Florian Tramèr et al. Membership Inference Attacks From First Principles, 2021, 2022 IEEE Symposium on Security and Privacy (SP).
[23] Graham Cormode et al. On the Importance of Difficulty Calibration in Membership Inference Attacks, 2021, ICLR.
[24] Owain Evans et al. TruthfulQA: Measuring How Models Mimic Human Falsehoods, 2021, ACL.
[25] Quoc V. Le et al. Finetuned Language Models Are Zero-Shot Learners, 2021, ICLR.
[26] Melissa Chase et al. Membership Inference on Word Embedding and Beyond, 2021, arXiv.
[27] Seth Neel et al. Adaptive Machine Unlearning, 2021, NeurIPS.
[28] Danqi Chen et al. SimCSE: Simple Contrastive Learning of Sentence Embeddings, 2021, EMNLP.
[29] Hong Yu et al. Membership Inference Attack Susceptibility of Clinical Language Models, 2021, arXiv.
[30] Ananda Theertha Suresh et al. Remember What You Want to Forget: Algorithms for Machine Unlearning, 2021, NeurIPS.
[31] Milad Nasr et al. Adversary Instantiation: Lower Bounds for Differentially Private Machine Learning, 2021, 2021 IEEE Symposium on Security and Privacy (SP).
[32] Charles Foster et al. The Pile: An 800GB Dataset of Diverse Text for Language Modeling, 2020, arXiv.
[33] Jiangchuan Liu et al. Federated Unlearning, 2020, arXiv.
[34] Tom B. Brown et al. Extracting Training Data from Large Language Models, 2020, USENIX Security Symposium.
[35] Yinjun Wu et al. DeltaGrad: Rapid Retraining of Machine Learning Models, 2020, ICML.
[36] Jonathan Ullman et al. Auditing Differentially Private Machine Learning: How Private is Private SGD?, 2020, NeurIPS.
[37] Raef Bassily et al. Stability of Stochastic Gradient Descent on Nonsmooth Convex Losses, 2020, NeurIPS.
[38] Mark Chen et al. Language Models are Few-Shot Learners, 2020, NeurIPS.
[39] Kamalika Chaudhuri et al. Approximate Data Deletion from Machine Learning Models: Algorithms and Evaluations, 2020, AISTATS.
[40] Santiago Zanella-Béguelin et al. Analyzing Information Leakage of Updates to Natural Language Models, 2019, CCS.
[41] Christopher A. Choquette-Choo et al. Machine Unlearning, 2019, 2021 IEEE Symposium on Security and Privacy (SP).
[42] James Zou et al. Making AI Forget You: Data Deletion in Machine Learning, 2019, NeurIPS.
[43] Matt Fredrikson et al. Stolen Memories: Leveraging Model Memorization for Calibrated White-Box Membership Inference, 2019, USENIX Security Symposium.
[44] Vitaly Feldman et al. Does Learning Require Memorization? A Short Tale about a Long Tail, 2019, STOC.
[45] Ming-Wei Chang et al. BoolQ: Exploring the Surprising Difficulty of Natural Yes/No Questions, 2019, NAACL.
[46] David Evans et al. Evaluating Differentially Private Machine Learning in Practice, 2019, USENIX Security Symposium.
[47] Vitaly Shmatikov et al. Auditing Data Provenance in Text-Generation Models, 2018, KDD.
[48] Kai Chen et al. Understanding Membership Inferences on Well-Generalized Learning Models, 2018, arXiv.
[49] Somesh Jha et al. Privacy Risk in Machine Learning: Analyzing the Connection to Overfitting, 2017, 2018 IEEE 31st Computer Security Foundations Symposium (CSF).
[50] Paul Voigt et al. The EU General Data Protection Regulation (GDPR), 2017.
[51] Vitaly Shmatikov et al. Membership Inference Attacks Against Machine Learning Models, 2016, 2017 IEEE Symposium on Security and Privacy (SP).
[52] Yoram Singer et al. Train Faster, Generalize Better: Stability of Stochastic Gradient Descent, 2015, ICML.
[53] Christopher Potts et al. Learning Word Vectors for Sentiment Analysis, 2011, ACL.
[54] Chin-Yew Lin et al. ROUGE: A Package for Automatic Evaluation of Summaries, 2004, ACL.
[55] D. Bolinger. According to, 1990.
[56] Huseyin A. Inan et al. Membership Inference Attacks Against NLP Classification Models, 2021.
[57] Jonathan Berant et al. CommonsenseQA: A Question Answering Challenge Targeting Commonsense Knowledge, 2019, NAACL.