Differentially Private Confidence Intervals for Empirical Risk Minimization

Data mining with differential privacy produces results that are affected by two kinds of noise: sampling noise arising from data collection, and privacy noise injected to prevent the reconstruction of sensitive information. In this paper, we consider the problem of designing confidence intervals for the parameters of a variety of differentially private machine learning models. Our algorithms provide confidence intervals that satisfy differential privacy (as well as the more recently proposed concentrated differential privacy) and can be used with existing differentially private mechanisms that train models using objective perturbation and output perturbation.
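To make the output-perturbation setting concrete, here is a minimal sketch (not the paper's method): an L2-regularized logistic regression is trained non-privately, and Gaussian noise calibrated to the minimizer's L2 sensitivity is added to the learned weights. The sensitivity bound 2/(n·λ) assumes a 1-Lipschitz loss and feature vectors with norm at most 1; all function names and hyperparameters below are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_logreg(X, y, lam, steps=500, lr=0.1):
    """L2-regularized logistic regression via gradient descent.
    Assumes labels y in {-1, +1} and rows of X with L2 norm <= 1."""
    n, d = X.shape
    w = np.zeros(d)
    for _ in range(steps):
        margins = y * (X @ w)
        # Gradient of mean logistic loss plus the L2 regularizer.
        grad = -(X * (y * sigmoid(-margins))[:, None]).mean(axis=0) + lam * w
        w -= lr * grad
    return w

def output_perturbation(X, y, lam, eps, delta, rng):
    """(eps, delta)-DP release of the model weights via the Gaussian
    mechanism.  Under the stated assumptions, changing one record moves
    the regularized minimizer by at most 2 / (n * lam) in L2 norm."""
    n = X.shape[0]
    w = train_logreg(X, y, lam)
    sensitivity = 2.0 / (n * lam)
    sigma = sensitivity * np.sqrt(2.0 * np.log(1.25 / delta)) / eps
    return w + rng.normal(0.0, sigma, size=w.shape)
```

Confidence intervals built from the released weights must account for both the sampling variability of `train_logreg` and the added noise `sigma`, which is the two-noise decomposition the abstract describes.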
