Fixed-Parameter Tractability of Anonymizing Data by Suppressing Entries

A popular model for protecting privacy when person-specific data is released is k-anonymity. A dataset is k-anonymous if each record is identical to at least (k − 1) other records in the dataset. The basic k-anonymization problem, which minimizes the number of dataset entries that must be suppressed to achieve k-anonymity, is NP-hard and hence not solvable both quickly and optimally in general. We apply parameterized complexity analysis to explore algorithmic options for restricted versions of this problem that occur in practice. We present the first fixed-parameter algorithms for this problem and identify key techniques that can be applied to this and other k-anonymization problems.
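The definition of k-anonymity and the cost being minimized can be illustrated with a minimal sketch. It is not the paper's algorithm: it only checks whether a table is k-anonymous and counts suppressed entries. The names `SUPPRESSED`, `is_k_anonymous`, `suppress`, and `count_suppressed`, the `*` marker, and the sample table are assumptions made for illustration.

```python
from collections import Counter

SUPPRESSED = "*"  # hypothetical marker for a suppressed entry


def is_k_anonymous(rows, k):
    """Return True if every row is identical to at least k - 1 other rows."""
    counts = Counter(tuple(row) for row in rows)
    return all(c >= k for c in counts.values())


def suppress(rows, cells):
    """Return a copy of rows with each (row, column) position in cells suppressed."""
    out = [list(row) for row in rows]
    for r, c in cells:
        out[r][c] = SUPPRESSED
    return out


def count_suppressed(rows):
    """Count suppressed entries, the quantity the optimization problem minimizes."""
    return sum(value == SUPPRESSED for row in rows for value in row)


# Illustrative data only: rows 0 and 1 differ in the second column.
table = [
    ["F", "30", "flu"],
    ["F", "31", "flu"],
    ["M", "40", "cold"],
    ["M", "40", "cold"],
]

anonymized = suppress(table, [(0, 1), (1, 1)])
```

In this example the original table is not 2-anonymous, while suppressing the two differing entries makes rows 0 and 1 identical, so `is_k_anonymous(anonymized, 2)` holds at a cost of two suppressed entries. Checking a given suppression is easy; the hardness lies in finding one of minimum cost.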
