Non-Asymptotic Lower Bounds For Training Data Reconstruction

Mathematical notions of privacy, such as differential privacy, are often stated as probabilistic guarantees that are difficult to interpret. It is imperative, however, that the implications of data sharing be effectively communicated to the data principal to ensure informed decision-making and offer full transparency with regard to the associated privacy risks. To this end, our work presents a rigorous quantitative evaluation of the protection conferred by private learners by investigating their resilience to training data reconstruction attacks. We accomplish this by deriving non-asymptotic lower bounds on the reconstruction error incurred by any adversary against $(\epsilon, \delta)$-differentially private learners, for target samples belonging to any compact metric space. Working with a generalization of differential privacy, termed metric privacy, we remove the boundedness assumptions on the input space prevalent in prior work and prove that our results hold for general locally compact metric spaces. We extend the analysis to the high-dimensional regime, wherein the input data dimensionality may exceed the adversary's query budget, and demonstrate that our bounds are minimax optimal in certain regimes.
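For reference, the two privacy notions invoked above admit the following standard formulations (the notation here is generic, not this paper's exact formalism). A randomized mechanism $M$ is $(\epsilon, \delta)$-differentially private if, for every pair of datasets $D, D'$ differing in a single record and every measurable set of outputs $S$,
\[
\Pr[M(D) \in S] \le e^{\epsilon} \, \Pr[M(D') \in S] + \delta.
\]
Metric privacy (the $d_{\mathcal{X}}$-privacy of Chatzikokolakis et al., 2013) generalizes this by letting the privacy loss scale with the distance between inputs in a metric space $(\mathcal{X}, d)$: for all $x, x' \in \mathcal{X}$ and all measurable $S$,
\[
\Pr[M(x) \in S] \le e^{d(x, x')} \, \Pr[M(x') \in S],
\]
which recovers pure $\epsilon$-differential privacy when $d$ is taken to be $\epsilon$ times the Hamming metric on datasets.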
