An Attack on the Privacy of Sanitized Data that Fuses the Outputs of Multiple Data Miners

Data sanitization is used to limit re-identification of individuals and disclosure of sensitive information from published data. We propose an attack on the privacy of published sanitized data that simply fuses the outputs of multiple data miners applied to that data. The attack is practical and requires no background knowledge or other additional information. Through a number of experiments, we show scenarios in which an adversary can combine the outputs of multiple miners using a simple fusion strategy to increase the chance of breaching the privacy of individuals whose data is stored in the database. The fusion attack provides a powerful method of breaching privacy in the form of partial disclosure, for both anonymized and perturbed data. It also provides an effective way of approximating the predictions of the best miner (the miner that gives the best results among all miners considered) when that miner cannot be determined in advance.
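The abstract does not say which fusion strategy is used, so the following is only a minimal sketch of the general idea, assuming plurality voting over the miners' per-record predictions of a sensitive attribute. The function names `fuse_majority` and `accuracy`, the tie-breaking rule, and the toy data are all hypothetical illustrations and are not taken from the paper or its experiments.

```python
from collections import Counter
from typing import Hashable, List, Sequence


def fuse_majority(predictions: Sequence[Sequence[Hashable]]) -> List[Hashable]:
    """Fuse per-record guesses from several miners by plurality vote.

    predictions[m][i] is miner m's guess for the sensitive value of record i.
    Ties are broken in favour of the earliest-listed miner whose guess is
    among the most-voted values (an assumption made for this sketch).
    """
    if not predictions:
        raise ValueError("need at least one miner")
    n_records = len(predictions[0])
    if any(len(p) != n_records for p in predictions):
        raise ValueError("all miners must predict the same records")

    fused = []
    for i in range(n_records):
        votes = Counter(p[i] for p in predictions)
        top = max(votes.values())
        # Pick the first miner (in input order) whose guess reached the top count.
        fused.append(next(p[i] for p in predictions if votes[p[i]] == top))
    return fused


def accuracy(guesses: Sequence[Hashable], truth: Sequence[Hashable]) -> float:
    """Fraction of records whose sensitive value was guessed correctly."""
    return sum(g == t for g, t in zip(guesses, truth)) / len(truth)


# Toy data, invented for illustration only: three miners, each wrong on
# different records, so voting corrects their individual mistakes.
truth = ["flu", "hiv", "flu", "cold", "hiv", "flu"]
miner_a = ["flu", "hiv", "cold", "cold", "flu", "flu"]
miner_b = ["flu", "flu", "flu", "cold", "hiv", "cold"]
miner_c = ["cold", "hiv", "flu", "flu", "hiv", "flu"]

fused = fuse_majority([miner_a, miner_b, miner_c])
for name, guesses in [("a", miner_a), ("b", miner_b), ("c", miner_c), ("fused", fused)]:
    print(name, round(accuracy(guesses, truth), 3))
```

In this contrived example each miner guesses four of the six records correctly, while the fused guess matches all six, because the miners' errors do not overlap. The adversary cannot compute `accuracy` without the ground truth; it appears here only to show why fusing can help even when the best miner is unknown.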
