A Survey of Attack Techniques on Privacy-Preserving Data Perturbation Methods

We focus primarily on additive and matrix-multiplicative data perturbation techniques in privacy-preserving data mining (PPDM). We survey a recent body of research aimed at better understanding the vulnerabilities of these techniques. In this research, the authors assume the role of an attacker and develop methods for estimating the original data from the perturbed data and any available prior knowledge. Finally, we briefly discuss research on attacks against k-anonymization, another data perturbation technique in PPDM.
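To make the two perturbation families concrete, the following is a minimal toy sketch and not a reproduction of any specific attack from the surveyed literature. It assumes 2-dimensional records, zero-mean Gaussian noise for the additive case, and a rotation matrix as the multiplicative perturbation. The attacker's prior knowledge is assumed to be two original records together with their perturbed counterparts. All function names (`additive_perturb`, `multiplicative_perturb`, `known_sample_attack`) are hypothetical and chosen only for illustration.

```python
import math
import random


def rotation_matrix(theta):
    """2x2 rotation matrix; an orthogonal matrix, so it preserves distances."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]


def matvec(m, v):
    """Multiply a 2x2 matrix by a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def matmul(a, b):
    """Multiply two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def inverse(m):
    """Invert a 2x2 matrix; raises if it is (numerically) singular."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    if abs(det) < 1e-12:
        raise ValueError("matrix is singular")
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def additive_perturb(records, sigma, rng):
    """Additive perturbation: y = x + r, with r drawn from N(0, sigma^2)."""
    return [[x + rng.gauss(0.0, sigma) for x in rec] for rec in records]


def multiplicative_perturb(records, m):
    """Matrix multiplicative perturbation: y = M x for every record x."""
    return [matvec(m, rec) for rec in records]


def known_sample_attack(known_x, known_y, perturbed):
    """Toy attack (assumption): the attacker knows two linearly independent
    original records and their perturbed versions. With records as columns,
    Y_k = M X_k, so M is estimated as Y_k X_k^{-1} and then inverted."""
    x_cols = [[known_x[0][0], known_x[1][0]],
              [known_x[0][1], known_x[1][1]]]
    y_cols = [[known_y[0][0], known_y[1][0]],
              [known_y[0][1], known_y[1][1]]]
    m_hat = matmul(y_cols, inverse(x_cols))
    m_hat_inv = inverse(m_hat)
    return [matvec(m_hat_inv, y) for y in perturbed]


if __name__ == "__main__":
    rng = random.Random(0)
    original = [[1.0, 2.0], [3.0, -1.0], [0.5, 4.0], [-2.0, 1.5]]

    noisy = additive_perturb(original, sigma=0.5, rng=rng)
    rotated = multiplicative_perturb(original, rotation_matrix(0.7))

    # The attacker sees all of `rotated` plus the first two original records.
    recovered = known_sample_attack(original[:2], rotated[:2], rotated)
    print("additive:", noisy)
    print("recovered from rotation:", recovered)
```

In this toy setting the rotation hides individual values while preserving pairwise Euclidean distances, yet a small amount of known input-output data is enough to undo it. Additive noise is not invertible in this way, so attacks on it instead rely on estimation from the noise distribution and the structure of the data.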
