Differentially Private Integer Partitions and their Applications

Given an integer N ≥ 0, a partition of N is a non-increasing sequence of numbers x1 ≥ x2 ≥ ... ≥ xN ≥ 0 such that x1 + ... + xN = N. We say that two partitions x and y are neighbors if the L1 distance between them is at most 1/2. Blocki et al. [BDB16] recently showed that there is an (ε, δ)-differentially private algorithm which, with high probability, achieves L1 error O(√N/ε), and they used their algorithm to publish password frequency data from a password dataset of 70 million Yahoo! users [Bon12]. Their algorithm, which was based on an (approximate) instantiation of the exponential mechanism [MT07], is computationally efficient in 1/ε, log(1/δ) and N. The applications of the mechanism of Blocki et al. [BDB16] are not limited to passwords. For example, the degree distribution of a social network G is simply a partition of the integer 2|E(G)|. Thus, the mechanism of [BDB16] could be used to preserve differential privacy when releasing the degree distribution. It is therefore particularly important to understand the performance of the exponential mechanism for integer partitions. We provide a pure ε-differentially private instantiation of the exponential mechanism whenever there is an a priori known upper bound on N. We also upper bound the mean squared error of the exponential mechanism by O(√N log N/ε²). For comparison, the best known results, due to Hay et al. [HLMJ09], achieved mean squared error O(√N log N/ε²). Additionally, we conjecture that the L1 error of the exponential mechanism scales with 1/√ε instead of 1/ε. Empirical data from the RockYou password frequency dataset supports this conjecture. The conjecture, if true, could lead to the development of several useful node-differentially private algorithms.
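As a rough illustration of the objects in the abstract, the Python sketch below builds the degree sequence of a small graph as a partition of 2|E(G)| and then samples from a brute-force exponential mechanism over all partitions of integers up to an assumed bound. It is not the algorithm of [BDB16] or of this paper: enumerating every partition is only feasible for very small N. The function names, the utility (negative L1 distance), the `sensitivity` default of 1.0 and the bound `n_max` are all assumptions made for illustration. Partitions are stored without their trailing zeros.

```python
import math
import random
from itertools import zip_longest


def partitions(n, max_part=None):
    """Yield every partition of n as a non-increasing tuple of positive parts."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, max_part), 0, -1):
        for rest in partitions(n - first, first):
            yield (first,) + rest


def l1_distance(x, y):
    """L1 distance between two partitions, padding the shorter one with zeros."""
    return sum(abs(a - b) for a, b in zip_longest(x, y, fillvalue=0))


def degree_partition(edges):
    """Sorted degree sequence of an undirected simple graph given as an edge list."""
    degree = {}
    for u, v in edges:
        degree[u] = degree.get(u, 0) + 1
        degree[v] = degree.get(v, 0) + 1
    return tuple(sorted(degree.values(), reverse=True))


def exponential_mechanism(x, n_max, epsilon, sensitivity=1.0, rng=random):
    """Sample a partition y of some n <= n_max with probability proportional
    to exp(-epsilon * l1_distance(x, y) / (2 * sensitivity)).

    Assumption: `sensitivity` bounds how much the L1 distance to any fixed y
    can change between neighboring inputs; 1.0 is only a placeholder.
    """
    candidates = [p for n in range(n_max + 1) for p in partitions(n)]
    weights = [
        math.exp(-epsilon * l1_distance(x, p) / (2 * sensitivity))
        for p in candidates
    ]
    return rng.choices(candidates, weights=weights, k=1)[0]


# Toy graph with 4 edges: its degree sequence is a partition of 2 * 4 = 8.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c")]
x = degree_partition(edges)
y = exponential_mechanism(x, n_max=10, epsilon=1.0)
print(x, y, l1_distance(x, y))
```

The candidate set ranges over every n up to `n_max` rather than a single N, mirroring the abstract's requirement of an a priori known upper bound on N. Smaller values of `epsilon` flatten the sampling weights, so the output tends to land farther from `x`.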

[1] Kunal Talwar, et al. Mechanism Design via Differential Privacy, 2007, 48th Annual IEEE Symposium on Foundations of Computer Science (FOCS'07).

[2] Sofya Raskhodnikova, et al. Private analysis of graph structure, 2011, Proc. VLDB Endow.

[3] G. Hardy, et al. Asymptotic Formulaæ in Combinatory Analysis, 1918.

[4] Jonathan Ullman, et al. Answering n^{2+o(1)} counting queries with differential privacy is hard, 2012, STOC '13.

[5] Sofya Raskhodnikova, et al. Smooth sensitivity and sampling in private data analysis, 2007, STOC '07.

[6] David D. Jensen, et al. Accurate Estimation of the Degree Distribution of Private Networks, 2009, 2009 Ninth IEEE International Conference on Data Mining.

[7] Avrim Blum, et al. Differentially private data analysis of social networks via restricted sensitivity, 2012, ITCS '13.

[8] Sofya Raskhodnikova, et al. Analyzing Graphs with Node Differential Privacy, 2013, TCC.

[9] Aaron Roth, et al. A learning theory approach to non-interactive database privacy, 2008, STOC.

[10] Joseph Bonneau, et al. The Science of Guessing: Analyzing an Anonymized Corpus of 70 Million Passwords, 2012, 2012 IEEE Symposium on Security and Privacy.

[11] Shuigeng Zhou, et al. Recursive mechanism: towards node differential privacy and unrestricted joins, 2013, SIGMOD '13.

[12] Sofya Raskhodnikova, et al. Efficient Lipschitz Extensions for High-Dimensional Graph Statistics and Node Private Degree Distributions, 2015, ArXiv.

[13] Differentially Private Password Frequency Lists Or, How to release statistics from 70 million passwords (on purpose), 2015.

[14] Anupam Datta, et al. CASH: A Cost Asymmetric Secure Hash Algorithm for Optimal Password Protection, 2015, 2016 IEEE 29th Computer Security Foundations Symposium (CSF).

[15] Aleksandra B. Slavkovic, et al. Differentially Private Graphical Degree Sequences and Synthetic Graphs, 2012, Privacy in Statistical Databases.

[16] Avrim Blum, et al. The Johnson-Lindenstrauss Transform Itself Preserves Differential Privacy, 2012, 2012 IEEE 53rd Annual Symposium on Foundations of Computer Science.