Tight Auditing of Differentially Private Machine Learning

Auditing mechanisms for differential privacy use probabilistic means to empirically estimate the privacy level of an algorithm. For private machine learning, existing auditing mechanisms are tight: the empirical privacy estimate (nearly) matches the algorithm's provable privacy guarantee. But these auditing techniques suffer from two limitations. First, they only give tight estimates under implausible worst-case assumptions (e.g., a fully adversarial dataset). Second, they require thousands or millions of training runs to produce non-trivial statistical estimates of the privacy leakage. This work addresses both issues. We design an improved auditing scheme that yields tight privacy estimates for natural (not adversarially crafted) datasets -- if the adversary can see all model updates during training. Prior auditing works rely on the same assumption, which is permitted under the standard differential privacy threat model. This threat model is also applicable, e.g., in federated learning settings. Moreover, our auditing scheme requires only two training runs (instead of thousands) to produce tight privacy estimates, by adapting recent advances in tight composition theorems for differential privacy. We demonstrate the utility of our improved auditing schemes by surfacing implementation bugs in private machine learning code that eluded prior auditing techniques.
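
For intuition, the generic recipe that such audits build on can be sketched in a few lines: run a distinguishing (membership-inference-style) attack many times against the training procedure with and without a target example, estimate the attack's false-positive and false-negative rates, and invert the hypothesis-testing characterization of (ε, δ)-differential privacy to obtain an empirical lower bound on ε. The Python sketch below illustrates only this standard conversion, using one-sided Clopper-Pearson confidence intervals; the function names are illustrative, and it is not the paper's exact two-run procedure.

# Illustrative sketch (not the paper's exact two-run auditing procedure):
# convert a distinguishing attack's measured error rates into an empirical
# lower bound on epsilon via the hypothesis-testing view of (eps, delta)-DP.
from scipy import stats
import numpy as np

def clopper_pearson_upper(errors, trials, alpha=0.05):
    """One-sided (1 - alpha) upper confidence bound on a binomial error rate."""
    if errors >= trials:
        return 1.0
    return stats.beta.ppf(1 - alpha, errors + 1, trials - errors)

def empirical_epsilon_lower_bound(false_positives, false_negatives, trials,
                                  delta=1e-5, alpha=0.05):
    """Any (eps, delta)-DP mechanism forces every distinguisher to satisfy
    FPR + exp(eps) * FNR >= 1 - delta and exp(eps) * FPR + FNR >= 1 - delta.
    Upper-bounding FPR and FNR with confidence intervals and inverting these
    constraints gives a statistically valid empirical lower bound on eps."""
    fpr_hi = clopper_pearson_upper(false_positives, trials, alpha)
    fnr_hi = clopper_pearson_upper(false_negatives, trials, alpha)
    candidates = [0.0]
    for a, b in ((fpr_hi, fnr_hi), (fnr_hi, fpr_hi)):
        if b > 0 and (1 - delta - a) > b:
            candidates.append(np.log((1 - delta - a) / b))
    return max(candidates)

# Example: an attack that errs on 10 of 1000 "canary absent" runs and
# 12 of 1000 "canary present" runs supports an epsilon lower bound of about 4.
print(empirical_epsilon_lower_bound(10, 12, 1000))

A larger epsilon lower bound from such a procedure means more privacy leakage was demonstrated; an audit is "tight" when this empirical bound approaches the epsilon proven analytically for the training algorithm.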
