Privacy via Maintaining Small Similitude Data for Big Data Statistical Representation

Despite its appeal, Big Data is often hard, slow, and expensive to handle because of its size. Moreover, as the amount of collected data grows, individual privacy raises more and more concerns: “what do they know about me?” Various algorithms have been proposed to enable privacy-preserving data release, with differential privacy as the current de facto standard. However, the processing time required to keep the data private is prohibitive, which makes these methods impractical for everyday use. With data collection continuing to grow, no solution is in sight.
