Evaluating privacy threats in released database views by symmetric indistinguishability

A privacy violation occurs when an unauthorized party obtains the association between an individual's identity and data that the individual considers private. Uncertainty and indistinguishability are two independent aspects that characterize the degree to which this association is revealed. Indistinguishability is the property that the attacker cannot tell apart the individuals in a group, while uncertainty is the property that the attacker cannot tell which private value, among a group of values, an individual actually has. This paper investigates indistinguishability as a general form of anonymity, applicable not only to generalized private tables but also to relational views and to sets of views obtained by multiple queries over a private database table. It shows that indistinguishability is strongly influenced by certain symmetries among individuals, in the released data, with respect to their private values. The paper provides both theoretical results and practical algorithms for checking whether a specific set of views over a private table provides sufficient indistinguishability.
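The two notions can be illustrated on a toy example. The sketch below is not the paper's algorithm. It is a brute-force illustration under one assumed reading of the abstract: two individuals are treated as symmetrically indistinguishable when swapping their private values in any table consistent with the released view always yields another consistent table. The names `consistent_tables`, `possible_values`, `symmetric`, the people, the zip codes, and the diseases are all hypothetical.

```python
from itertools import product

def consistent_tables(people, domain, view, released):
    """Enumerate every assignment person -> private value whose
    view output equals the released output (exponential; toy sizes only)."""
    tables = []
    for values in product(domain, repeat=len(people)):
        table = dict(zip(people, values))
        if view(table) == released:
            tables.append(table)
    return tables

def possible_values(tables, person):
    """Uncertainty: the private values the attacker cannot rule out."""
    return {t[person] for t in tables}

def symmetric(tables, a, b):
    """Assumed symmetry check: the set of consistent tables is closed
    under swapping the private values of a and b."""
    seen = {tuple(sorted(t.items())) for t in tables}
    for t in tables:
        swapped = dict(t)
        swapped[a], swapped[b] = t[b], t[a]
        if tuple(sorted(swapped.items())) not in seen:
            return False
    return True

# Hypothetical data: a public attribute (zip code) and a private one (disease).
people = ["ann", "bob", "cat"]
zip_of = {"ann": 1, "bob": 1, "cat": 2}
domain = ["flu", "cold"]
true_table = {"ann": "flu", "bob": "cold", "cat": "flu"}

def view(table):
    # The released view drops names and keeps (zip, disease) pairs.
    return sorted((zip_of[p], table[p]) for p in people)

released = view(true_table)
tables = consistent_tables(people, domain, view, released)

ann_values = possible_values(tables, "ann")   # ann may have either disease
cat_values = possible_values(tables, "cat")   # cat's disease is pinned down
ann_bob = symmetric(tables, "ann", "bob")     # same zip: interchangeable
ann_cat = symmetric(tables, "ann", "cat")     # different zip: not interchangeable
```

In this toy case ann and bob share a zip code, so the view cannot distinguish them, while cat is alone in her zip code and her value is fully determined. The enumeration is exponential in the number of individuals and is meant only to make the definitions concrete.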
