Boosting and Differential Privacy

Boosting is a general method for improving the accuracy of learning algorithms. We use boosting to construct improved privacy-preserving synopses of an input database. These are data structures that yield, for a given set $\mathcal{Q}$ of queries over an input database, reasonably accurate estimates of the responses to every query in $\mathcal{Q}$, even when the number of queries is much larger than the number of rows in the database. Given a base synopsis generator that takes a distribution on $\mathcal{Q}$ and produces a "weak" synopsis that yields "good" answers for a majority of the weight in $\mathcal{Q}$, our Boosting for Queries algorithm obtains a synopsis that is good for all of $\mathcal{Q}$. We ensure privacy for the rows of the database, but the boosting is performed on the queries. We also provide the first synopsis generators for arbitrary sets of arbitrary low-sensitivity queries, i.e., queries whose answers do not vary much under the addition or deletion of a single row. In the execution of our algorithm, certain tasks, each incurring some privacy loss, are performed many times. To analyze the cumulative privacy loss, we obtain an $O(\varepsilon^2)$ bound on the expected privacy loss from a single $\varepsilon$-differentially private mechanism. Combining this with evolution of confidence arguments from the literature, we get stronger bounds on the expected cumulative privacy loss due to multiple mechanisms, each of which provides $\varepsilon$-differential privacy or one of its relaxations, and each of which operates on (potentially) different, adaptively chosen, databases.
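
The gap between worst-case and expected privacy loss can be checked numerically on a toy mechanism. The sketch below is illustrative only and is not taken from the paper: it assumes binary randomized response as the $\varepsilon$-differentially private mechanism, and it takes the expected privacy loss to be the KL divergence between the output distributions on two neighbouring inputs. The abstract states only the $O(\varepsilon^2)$ rate; the explicit bound $\varepsilon(e^{\varepsilon} - 1)$ used for comparison is an assumption of this example, and the function names are hypothetical.

```python
import math


def worst_case_privacy_loss(eps: float) -> float:
    """Largest log-ratio of output probabilities for binary randomized response.

    Assumption of this sketch: the mechanism reports the true bit with
    probability p = e^eps / (1 + e^eps) and the flipped bit otherwise.
    """
    p = math.exp(eps) / (1.0 + math.exp(eps))
    q = 1.0 - p
    return math.log(p / q)


def expected_privacy_loss(eps: float) -> float:
    """KL divergence between the output distributions on inputs 1 and 0.

    On input 1 the output is 1 with probability p and 0 with probability q;
    on input 0 the two probabilities are swapped.
    """
    p = math.exp(eps) / (1.0 + math.exp(eps))
    q = 1.0 - p
    return p * math.log(p / q) + q * math.log(q / p)


if __name__ == "__main__":
    for eps in (0.01, 0.1, 0.5, 1.0):
        worst = worst_case_privacy_loss(eps)
        expected = expected_privacy_loss(eps)
        bound = eps * (math.exp(eps) - 1.0)  # comparison bound assumed by this sketch
        print(f"eps={eps}: worst={worst:.6f} expected={expected:.6f} bound={bound:.6f}")
```

For this particular mechanism the expected loss simplifies to $\varepsilon \tanh(\varepsilon/2)$, which is roughly $\varepsilon^2/2$ for small $\varepsilon$, while the worst-case loss stays at $\varepsilon$. That quadratic-versus-linear gap is why an expected-loss bound helps when many mechanisms are composed.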

Guy N. Rothblum | Cynthia Dwork | Salil P. Vadhan
