Individual Privacy vs Population Privacy: Learning to Attack Anonymization

Over the last decade, great strides have been made in developing techniques to compute functions privately. In particular, Differential Privacy gives strong guarantees about what conclusions can be drawn about an individual. In contrast, various syntactic methods for providing privacy (criteria such as k-anonymity and l-diversity) have been criticized for still allowing private information about an individual to be inferred. In this report, we consider the ability of an attacker to use data meeting privacy definitions to build an accurate classifier. We demonstrate that even under Differential Privacy, such classifiers can be used to accurately infer "private" attributes in realistic data. We compare this to similar approaches for inference-based attacks on other forms of anonymized data. Placing these attacks on the same scale, we observe that the accuracy of inferring private attributes from Differentially Private data and from l-diverse data can be quite similar.
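To make the attack setting concrete, the sketch below illustrates the general idea (not the report's exact method): an attacker receives a differentially private release and fits a simple classifier to it, then guesses each individual's "private" attribute from their public quasi-identifier. The data, the epsilon value, the Laplace-noised contingency-table release, and the MAP-style attacker are all illustrative assumptions introduced here for the example.

```python
# A minimal sketch, assuming the release is a Laplace-noised contingency
# table over (quasi-identifier, sensitive attribute) cells. The counts,
# epsilon, and the MAP-style attacker below are illustrative assumptions,
# not the authors' implementation.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical raw data: 3 quasi-identifier cells x 2 sensitive values,
# with a deliberate correlation between the two.
true_counts = np.array([[80., 20.],
                        [30., 70.],
                        [55., 45.]])

def laplace_release(counts, epsilon):
    """Release the contingency table under epsilon-DP (sensitivity 1)."""
    noisy = counts + rng.laplace(scale=1.0 / epsilon, size=counts.shape)
    return np.clip(noisy, 0.0, None)   # clamp negative counts for usability

def attack(noisy_counts, qi_value):
    """Guess the most likely sensitive value for a given quasi-identifier."""
    return int(np.argmax(noisy_counts[qi_value]))

released = laplace_release(true_counts, epsilon=1.0)

# Inference accuracy: fraction of the population whose sensitive value the
# attacker recovers using only the released (private) table.
correct = sum(true_counts[qi, attack(released, qi)] for qi in range(3))
print("attacker accuracy:", correct / true_counts.sum())
```

Even with noise added, the correlations between quasi-identifiers and the sensitive attribute survive in the release, so a classifier built from it can predict the sensitive attribute well above the base rate; the same style of attack applies to l-diverse or generalized releases by fitting the classifier to the published groups instead of the noisy table.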
