Differentially Private Data Release through Multidimensional Partitioning

Differential privacy is a strong notion of individual privacy protection for privacy-preserving data analysis and publishing. In this paper, we study the problem of differentially private histogram release through an interactive differential privacy interface. We propose two multidimensional partitioning strategies: a baseline cell-based partitioning and an innovative kd-tree-based partitioning. In addition to formal proofs of the differential privacy and usefulness guarantees for linear distributive queries, we present a set of experimental results that demonstrate the feasibility and performance of our methods.
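As a rough illustration of the baseline cell-based idea, the sketch below builds a histogram with one cell per combination of attribute values and perturbs each cell count with Laplace noise. It is a minimal sketch under stated assumptions, not the algorithm specified in the paper: the function names (`laplace_noise`, `cell_histogram`, `range_count`), the representation of records as tuples, and the choice of noise scale 1/epsilon (which assumes neighboring datasets differ by adding or removing one record, so the histogram has L1 sensitivity 1) are all illustrative. The kd-tree-based strategy is not sketched here.

```python
import random
from collections import Counter
from itertools import product


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponentials with mean `scale`
    # follows a Laplace distribution with location 0 and that scale.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def cell_histogram(records, domains, epsilon, rng=None):
    """Return a noisy count for every cell in the cross product of `domains`.

    Assumption: neighboring datasets differ by one added or removed record,
    so each record affects exactly one cell count by 1 (L1 sensitivity 1).
    """
    if rng is None:
        rng = random.Random()
    counts = Counter(records)
    scale = 1.0 / epsilon
    # Noise is added to every cell, including empty ones; skipping empty
    # cells would reveal which cells contain no records.
    return {
        cell: counts.get(cell, 0) + laplace_noise(scale, rng)
        for cell in product(*domains)
    }


def range_count(noisy_hist, predicate):
    # Answer a count query by summing the noisy cells that satisfy it.
    return sum(v for cell, v in noisy_hist.items() if predicate(cell))
```

A caller might release `cell_histogram(records, [(0, 1), ("a", "b")], epsilon=0.5)` once and then answer any number of `range_count` queries from the released cells without spending further privacy budget. The error of such a query grows with the number of cells it covers, which is one motivation for coarser, data-aware partitions.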
