Optimal private halfspace counting via discrepancy

A range counting problem is specified by a set $P$ of $|P| = n$ points in $\mathbb{R}^d$, an integer weight $x_p$ associated with each point $p \in P$, and a range space $\mathcal{R} \subseteq 2^P$. Given a query range $R \in \mathcal{R}$, the output is $R(x) = \sum_{p \in R} x_p$. The average squared error of an algorithm $\mathcal{A}$ is $\frac{1}{|\mathcal{R}|} \sum_{R \in \mathcal{R}} (\mathcal{A}(R, x) - R(x))^2$. Range counting for different range spaces is a central problem in computational geometry. We study $(\varepsilon, \delta)$-differentially private algorithms for range counting. Our main results are for the range space given by hyperplanes, that is, the halfspace counting problem. We present an $(\varepsilon, \delta)$-differentially private algorithm for halfspace counting in $d$ dimensions which is $O(n^{1-1/d})$ approximate for average squared error. This contrasts with the $\Omega(n)$ lower bound established by the classical result of Dinur and Nissim on approximation for arbitrary subset counting queries. We also show a matching lower bound of $\Omega(n^{1-1/d})$ approximation for any $(\varepsilon, \delta)$-differentially private algorithm for halfspace counting. Both bounds are obtained using discrepancy theory. For the lower bound, we use a modified discrepancy measure and bound the approximation of $(\varepsilon, \delta)$-differentially private algorithms for range counting queries in terms of this discrepancy. We also relate the modified discrepancy measure to classical combinatorial discrepancy, which allows us to exploit known discrepancy lower bounds. This approach also yields a lower bound of $\Omega((\log n)^{d-1})$ for $(\varepsilon, \delta)$-differentially private orthogonal range counting in $d$ dimensions, the first known superconstant lower bound for this problem. For the upper bound, we use an approach inspired by partial coloring methods for proving discrepancy upper bounds, and obtain $(\varepsilon, \delta)$-differentially private algorithms for range counting with range spaces whose shatter function is polynomially bounded.
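The definitions above can be made concrete with a minimal Python sketch. It is only an illustration of the exact count $R(x)$ and of the average squared error, not the algorithm of the paper. The representation of a halfspace as a pair `(a, b)` meaning $\{p : a \cdot p \le b\}$, all function names, and the toy data are assumptions introduced here. The noisy baseline perturbs every weight with Laplace noise of scale $1/\varepsilon$, which is a standard input-perturbation approach under the assumption that neighboring databases differ by 1 in a single weight; it does not achieve the $O(n^{1-1/d})$ bound stated in the abstract.

```python
import random


def range_count(points, weights, halfspace):
    """Exact answer R(x): sum of weights x_p over points p with a . p <= b."""
    a, b = halfspace
    return sum(
        w
        for p, w in zip(points, weights)
        if sum(ai * pi for ai, pi in zip(a, p)) <= b
    )


def average_squared_error(answers, points, weights, halfspaces):
    """(1/|R|) * sum over ranges R of (answer for R - R(x))^2."""
    total = 0.0
    for ans, h in zip(answers, halfspaces):
        total += (ans - range_count(points, weights, h)) ** 2
    return total / len(halfspaces)


def laplace(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_weights_baseline(points, weights, halfspaces, epsilon, rng):
    """Naive baseline (an assumption, NOT the paper's algorithm):
    add Laplace(1/epsilon) noise to each weight once, then answer
    every halfspace query from the noisy weights."""
    noisy = [w + laplace(1.0 / epsilon, rng) for w in weights]
    return [range_count(points, noisy, h) for h in halfspaces]


# Toy instance in the plane (hypothetical data).
points = [(0, 0), (1, 0), (0, 1), (1, 1)]
weights = [2, 1, 3, 4]
halfspaces = [((1, 0), 0.5), ((0, 1), 0.5), ((1, 1), 1.0)]

exact = [range_count(points, weights, h) for h in halfspaces]
private = noisy_weights_baseline(
    points, weights, halfspaces, epsilon=1.0, rng=random.Random(0)
)
error = average_squared_error(private, points, weights, halfspaces)
```

Because each query sums the noise of every point it contains, the squared error of this baseline grows linearly with the number of points in a range, which is the gap the discrepancy-based algorithm is designed to close.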

[1] Elaine Shi, et al. Private and Continual Release of Statistics, 2010, ICALP.

[2] Katrina Ligett, et al. A Simple and Practical Algorithm for Differentially Private Data Release, 2010, NIPS.

[3] Cynthia Dwork, et al. Calibrating Noise to Sensitivity in Private Data Analysis, 2006, TCC.

[4] József Beck. Balanced two-colorings of finite sets in the cube, 1989, Discret. Math.

[5] József Beck, et al. Balanced two-colorings of finite sets in the square I, 1981, Comb.

[6] Moni Naor, et al. On the complexity of differentially private data release: efficient algorithms and hardness results, 2009, STOC '09.

[7] M. Sharir, et al. An elementary approach to lower bounds in geometric discrepancy, 1995, Discret. Comput. Geom.

[8] Anindya De, et al. Lower Bounds in Differential Privacy, 2011, TCC.

[9] Adam D. Smith, et al. The price of privately releasing contingency tables and the spectra of random matrices with correlated rows, 2010, STOC '10.

[10] Pankaj K. Agarwal, et al. Geometric Range Searching and Its Relatives, 2007.

[11] Bernard Chazelle, et al. A trace bound for the hereditary discrepancy, 2000, SCG '00.

[12] Aaron Roth, et al. Iterative Constructions and Private Data Release, 2011, TCC.

[13] Tim Roughgarden, et al. Interactive privacy via the median mechanism, 2009, STOC '10.

[14] Thomas Ottmann, et al. Data Structures and Efficient Algorithms, Final Report on the DFG Special Joint Initiative, 1992, Data Structures and Efficient Algorithm.

[15] Andrew Chi-Chih Yao, et al. A general approach to d-dimensional geometric queries, 1985, STOC '85.

[16] Jirí Matousek, et al. Tight upper bounds for the discrepancy of half-spaces, 1995, Discret. Comput. Geom.

[17] Cynthia Dwork, et al. The price of privacy and the limits of LP decoding, 2007, STOC '07.

[18] Aaron Roth, et al. A learning theory approach to noninteractive database privacy, 2011, JACM.

[19] Bernard Chazelle, et al. The discrepancy method - randomness and complexity, 2000.

[20] K. F. Roth. On irregularities of distribution, 1954.

[21] Bernard Chazelle, et al. The Discrepancy Method, 1998, ISAAC.

[22] Guy N. Rothblum, et al. Boosting and Differential Privacy, 2010, 2010 IEEE 51st Annual Symposium on Foundations of Computer Science.

[23] Emo Welzl, et al. On Spanning Trees with Low Crossing Numbers, 1992, Data Structures and Efficient Algorithms.

[24] Aaron Roth. Differential Privacy and the Fat-Shattering Dimension of Linear Queries, 2010, APPROX-RANDOM.

[25] Bernard Chazelle, et al. Quasi-optimal range searching in spaces of finite VC-dimension, 1989, Discret. Comput. Geom.

[26] Aaron Roth, et al. Privately Releasing Conjunctions and the Statistical Query Barrier, 2013, SIAM J. Comput.

[27] Cynthia Dwork, et al. New Efficient Attacks on Statistical Disclosure Control Mechanisms, 2008, CRYPTO.

[28] J. van Leeuwen, et al. Discrete and Computational Geometry, 2002, Lecture Notes in Computer Science.

[29] J. Spencer. Six standard deviations suffice, 1985.

[30] J. Matousek, et al. Geometric Discrepancy: An Illustrated Guide, 2009.

[31] Guy N. Rothblum, et al. A Multiplicative Weights Mechanism for Privacy-Preserving Data Analysis, 2010, 2010 IEEE 51st Annual Symposium on Foundations of Computer Science.

[32] Kunal Talwar, et al. On the geometry of differential privacy, 2009, STOC '10.

[33] David Haussler, et al. Sphere Packing Numbers for Subsets of the Boolean n-Cube with Bounded Vapnik-Chervonenkis Dimension, 1995, J. Comb. Theory, Ser. A.

[34] Timothy M. Chan. Optimal Partition Trees, 2012, Discret. Comput. Geom.

[35] Cynthia Dwork, et al. Privacy, accuracy, and consistency too: a holistic solution to contingency table release, 2007, PODS.

[36] Irit Dinur, et al. Revealing information while preserving privacy, 2003, PODS.