Differentially Private Set-Valued Data Release against Incremental Updates

Publication of the private set-valued data will provide enormous opportunities for counting queries and various data mining tasks. Compared to those previous methods based on partition-based privacy models (e.g., k-anonymity), differential privacy provides strong privacy guarantees against adversaries with arbitrary background knowledge. However, the existing solutions based on differential privacy for data publication are currently limited to static datasets, and do not adequately address today’s demand for up-to-date information. In this paper, we address the problem of differentially private set-valued data release on an incremental scenario in which the data need to be transformed are not static. Motivated by this, we propose an efficient algorithm, called IncTDPart, to incrementally generate a series of differentially private releases. The proposed algorithm is based on top-down partitioning model with the help of item-free taxonomy tree and update-bounded mechanism. Extensive experiments on real datasets confirm that our approach maintains high utility and scalability for counting query.

[1]  Elisa Bertino,et al.  Secure Anonymization for Incremental Datasets , 2006, Secure Data Management.

[2]  Panos Kalnis,et al.  Privacy-preserving anonymization of set-valued data , 2008, Proc. VLDB Endow..

[3]  Divesh Srivastava,et al.  Differentially private summaries for sparse data , 2012, ICDT '12.

[4]  Ashwin Machanavajjhala,et al.  Publishing Search Logs—A Comparative Study of Privacy Guarantees , 2012, IEEE Transactions on Knowledge and Data Engineering.

[5]  Yin Yang,et al.  Differentially private histogram publication , 2012, The VLDB Journal.

[6]  Robin Milner,et al.  On Observing Nondeterminism and Concurrency , 1980, ICALP.

[7]  Frank McSherry,et al.  Privacy integrated queries: an extensible platform for privacy-preserving data analysis , 2009, SIGMOD Conference.

[8]  Yufei Tao,et al.  M-invariance: towards privacy preserving re-publication of dynamic datasets , 2007, SIGMOD '07.

[9]  Benjamin C. M. Fung,et al.  Differentially private transit data publication: a case study on the montreal transportation system , 2012, KDD.

[10]  Ilya Mironov,et al.  Differentially private recommender systems: building privacy into the net , 2009, KDD.

[11]  Moni Naor,et al.  Differential privacy under continual observation , 2010, STOC '10.

[12]  Dan Suciu,et al.  Boosting the accuracy of differentially private histograms through consistency , 2009, Proc. VLDB Endow..

[13]  Elaine Shi,et al.  Private and Continual Release of Statistics , 2010, TSEC.

[14]  Cynthia Dwork,et al.  Differential Privacy , 2006, ICALP.

[15]  Benjamin C. M. Fung,et al.  Publishing set-valued data via differential privacy , 2011, Proc. VLDB Endow..

[16]  Ashwin Machanavajjhala,et al.  No free lunch in data privacy , 2011, SIGMOD '11.

[17]  Johannes Gehrke,et al.  iReduct: differential privacy with reduced relative errors , 2011, SIGMOD '11.

[18]  Philip S. Yu,et al.  Can the Utility of Anonymized Data be Used for Privacy Breaches? , 2009, TKDD.

[19]  Adam D. Smith,et al.  Composition attacks and auxiliary information in data privacy , 2008, KDD.

[20]  Cynthia Dwork,et al.  Calibrating Noise to Sensitivity in Private Data Analysis , 2006, TCC.

[21]  Jeffrey F. Naughton,et al.  Anonymization of Set-Valued Data via Top-Down, Local Generalization , 2009, Proc. VLDB Endow..