Paper information - Generalization Performances of Randomized Classifiers and Algorithms built on Data Dependent Distributions

In this paper we prove that a randomized algorithm based on the data-generating-dependent prior and data-dependent posterior Boltzmann distributions of Catoni (2007) is Differentially Private (DP) and shows better generalization properties than the Gibbs (randomized) classifier associated with the same distributions. For this purpose, we develop a tight DP-based generalization bound, which improves over the current state-of-the-art Hoeffding-type bound.

Davide Anguita | Luca Oneto | Sandro Ridella

[1] Stephen E. Fienberg, et al. Learning with Differential Privacy: Stability, Learnability and the Sufficiency and Necessity of ERM Principle, 2015, J. Mach. Learn. Res.

[2] Toniann Pitassi, et al. Generalization in Adaptive Data Analysis and Holdout Reuse, 2015, NIPS.

[3] Davide Anguita, et al. PAC-Bayesian analysis of distribution dependent priors: Tighter risk bounds and stability analysis, 2016, Pattern Recognit. Lett.

[4] Davide Anguita, et al. Tuning the Distribution Dependent Prior in the PAC-Bayes Framework based on Empirical Data, 2016, ESANN.

[5] E. S. Pearson, et al. The Use of Confidence or Fiducial Limits Illustrated in the Case of the Binomial, 1934.

[6] Toniann Pitassi, et al. Preserving Statistical Validity in Adaptive Data Analysis, 2014, STOC.

[7] François Laviolette, et al. PAC-Bayesian learning of linear classifiers, 2009, ICML '09.

[8] O. Catoni. PAC-Bayesian Supervised Classification: The Thermodynamics of Statistical Learning, 2007, arXiv:0712.0248.

[9] François Laviolette, et al. Risk bounds for the majority vote: from a PAC-Bayesian analysis to a learning algorithm, 2015, J. Mach. Learn. Res.

[10] John Shawe-Taylor, et al. Tighter PAC-Bayes bounds through distribution-dependent priors, 2013, Theor. Comput. Sci.

[11] Aaron Roth, et al. The Algorithmic Foundations of Differential Privacy, 2014, Found. Trends Theor. Comput. Sci.

[12] H. Chernoff. A Measure of Asymptotic Efficiency for Tests of a Hypothesis Based on the Sum of Observations, 1952.