Efficient, Noise-Tolerant, and Private Learning via Boosting

We introduce a simple framework for designing private boosting algorithms. We give natural conditions under which these algorithms are differentially private, efficient, and noise-tolerant PAC learners. To demonstrate our framework, we use it to construct noise-tolerant and private PAC learners for large-margin halfspaces whose sample complexity does not depend on the dimension. We give two sample complexity bounds for our large-margin halfspace learner. The first bound rests only on differential privacy, leveraging the privacy guarantee itself to ensure generalization; it illustrates a general methodology for deriving PAC learners from privacy, which may be of independent interest. The second bound uses standard tools from the theory of large-margin classification (the fat-shattering dimension) to match the best known sample complexity for differentially private learning of large-margin halfspaces, while additionally tolerating random label noise.
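
Since the framework is algorithmic, a compact illustration may help. Below is a minimal, hypothetical Python sketch of the pattern the abstract describes: a smooth-boosting loop whose weak learner for large-margin halfspaces is a noise-perturbed weighted centroid. Everything here, including the function names, the Gaussian noise scale `sigma`, the step size `eta`, and the clipped multiplicative-weights update, is an illustrative assumption, not the paper's exact construction or its privacy accounting.

```python
# Illustrative sketch only: not the paper's exact algorithm or noise calibration.
import numpy as np

def noisy_centroid_weak_learner(X, y, weights, sigma, rng):
    """Weak learner sketch: a weighted average of labeled examples,
    perturbed with Gaussian noise as a stand-in for a differentially
    private release. A real implementation would calibrate sigma to the
    weighted sum's sensitivity and the target privacy parameters."""
    w = (weights[:, None] * (y[:, None] * X)).sum(axis=0)
    return w + rng.normal(scale=sigma, size=w.shape)

def private_boost(X, y, rounds, eta, sigma, seed=0):
    """Boosting loop sketch: reweight examples multiplicatively on their
    margins. Clipping the exponent bounds how fast any one example's
    weight can grow, a stand-in for the density cap that true smooth
    boosters enforce to keep per-round sensitivity (and hence noise) low."""
    rng = np.random.default_rng(seed)
    n = len(y)
    weights = np.full(n, 1.0 / n)
    hypotheses = []
    for _ in range(rounds):
        w = noisy_centroid_weak_learner(X, y, weights, sigma, rng)
        hypotheses.append(w)
        margins = y * (X @ w) / (np.linalg.norm(w) + 1e-12)
        weights *= np.exp(-eta * np.clip(margins, -1.0, 1.0))
        weights /= weights.sum()
    def classifier(x):
        # Final hypothesis: majority vote over the rounds' halfspaces.
        votes = np.sign([x @ w for w in hypotheses])
        return np.sign(votes.sum())
    return classifier
```

The smoothness constraint, i.e., never letting a single example dominate the reweighted distribution, is the property that keeps each round's sensitivity small enough for modest noise, and it is also what smooth boosters exploit to tolerate random label noise.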
