A utility preserving data-oriented anonymization method based on data ordering

Owing to recent advances, some organizations now collect and publish data for scientific purposes. Published data should be anonymized so that it remains useful while the privacy of data respondents is preserved. Hence, there is a trade-off between data utility and privacy. Microaggregation is a popular family of anonymization methods that operate on numerical data. In this paper, we propose a microaggregation algorithm called NFPN_MHM that first sorts the data along a spiral-shaped ordering and then finds the partitioning with the lowest utility loss with respect to the sorted data. Experimental results show that the proposed method attains lower information loss than traditional microaggregation methods and provides a better trade-off between data utility and privacy, especially for scattered data.
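The two-step idea (order the records, then choose the best partition of that ordering) can be illustrated with a minimal sketch. It is not the NFPN_MHM algorithm itself: the spiral ordering is not specified in this abstract, so the hypothetical `order_records` below uses a plain lexicographic sort as a stand-in. The assumptions that groups hold between k and 2k-1 consecutive records and that utility loss is measured as the within-group sum of squared errors are common in microaggregation but are not stated here. All function names are illustrative.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def sse(group: Sequence[Point]) -> float:
    """Sum of squared distances from each record to the group centroid."""
    n, dim = len(group), len(group[0])
    centroid = [sum(p[d] for p in group) / n for d in range(dim)]
    return sum((p[d] - centroid[d]) ** 2 for p in group for d in range(dim))


def order_records(data: Sequence[Point]) -> List[Point]:
    # ASSUMPTION: placeholder ordering (lexicographic sort), standing in
    # for the spiral ordering, which the abstract does not specify.
    return sorted(data)


def best_partition(ordered: Sequence[Point], k: int) -> List[Tuple[int, int]]:
    """Dynamic program over an ordered sequence: split it into consecutive
    groups of size k..2k-1 so that the total within-group SSE is minimal.
    Returns half-open index ranges (start, end)."""
    n = len(ordered)
    if k < 1 or n < k:
        raise ValueError("need k >= 1 and at least k records")
    inf = float("inf")
    cost = [inf] * (n + 1)   # cost[i]: best loss for the first i records
    back = [-1] * (n + 1)    # back[i]: start index of the last group
    cost[0] = 0.0
    for end in range(k, n + 1):
        for size in range(k, min(2 * k - 1, end) + 1):
            start = end - size
            if cost[start] == inf:
                continue  # the first `start` records cannot be partitioned
            candidate = cost[start] + sse(ordered[start:end])
            if candidate < cost[end]:
                cost[end], back[end] = candidate, start
    groups = []
    end = n
    while end > 0:           # walk the back-pointers to recover the groups
        groups.append((back[end], end))
        end = back[end]
    groups.reverse()
    return groups


def microaggregate(data: Sequence[Point], k: int) -> List[Point]:
    """Replace every record by the centroid of its group."""
    ordered = order_records(data)
    result: List[Point] = []
    for start, end in best_partition(ordered, k):
        group = ordered[start:end]
        dim = len(group[0])
        centroid = tuple(sum(p[d] for p in group) / len(group) for d in range(dim))
        result.extend([centroid] * len(group))
    return result


if __name__ == "__main__":
    sample = [(1.0,), (2.0,), (3.0,), (10.0,), (11.0,), (12.0,), (13.0,)]
    print(best_partition(order_records(sample), k=3))
    print(microaggregate(sample, k=3))
```

In this sketch every published record shares its values with at least k-1 others, and the quality of the result depends entirely on how well the ordering keeps similar records adjacent, which is the role the abstract assigns to the sorting step.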
