Paper information - The effect of homogeneity on the computational complexity of combinatorial data anonymization

A matrix M is said to be k-anonymous if for each row r in M there are at least k − 1 other rows in M which are identical to r. The NP-hard k-Anonymity problem asks, given an n × m-matrix M over a fixed alphabet and an integer s > 0, whether M can be made k-anonymous by suppressing (blanking out) at most s entries. Complementing previous work, we introduce two new "data-driven" parameterizations for k-Anonymity, the number t_in of different input rows and the number t_out of different output rows, both modeling aspects of data homogeneity. We show that k-Anonymity is fixed-parameter tractable for the parameter t_in, and that it is NP-hard even for t_out = 2 and alphabet size four. Notably, our fixed-parameter tractability result implies that k-Anonymity can be solved in linear time when t_in is a constant. Our computational hardness results also extend to the related privacy problems p-Sensitivity and ℓ-Diversity, while our fixed-parameter tractability results extend to p-Sensitivity and the usage of domain generalization hierarchies, where the entries are replaced by more general data instead of being completely suppressed.
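
To make the definitions in the abstract concrete, the following is a minimal Python sketch, not the paper's algorithm. It checks whether a matrix is k-anonymous, counts the number of distinct rows (the quantity the abstract calls t_in for the input and t_out for the output), and applies a given set of suppressions. The function names, the `"*"` blank symbol, and the example matrix are illustrative assumptions, not taken from the paper.

```python
from collections import Counter

def is_k_anonymous(matrix, k):
    """Return True if every row of `matrix` occurs at least k times,
    i.e. each row has at least k - 1 identical other rows."""
    counts = Counter(tuple(row) for row in matrix)
    return all(c >= k for c in counts.values())

def count_distinct_rows(matrix):
    """Number of different rows (t_in for an input, t_out for an output)."""
    return len({tuple(row) for row in matrix})

def suppress(matrix, positions, blank="*"):
    """Return a copy of `matrix` with the entries at `positions` blanked out.
    `blank` is an assumed placeholder symbol for a suppressed entry."""
    out = [list(row) for row in matrix]
    for i, j in positions:
        out[i][j] = blank
    return out

# Hypothetical 4 x 3 input matrix with three different rows.
M = [
    ["a", "b", "c"],
    ["a", "b", "d"],
    ["x", "y", "z"],
    ["x", "y", "z"],
]

# Suppressing s = 2 entries (last column of the first two rows)
# makes the first two rows identical.
M_out = suppress(M, [(0, 2), (1, 2)])
```

In this example the input is not 2-anonymous, while the output after two suppressions is, with the number of different rows dropping from three to two. The sketch only verifies a given suppression; finding one with at most s suppressed entries is the NP-hard part.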
