Pacific Asia Conference on Information Systems (PACIS), 7-15-2012. μ-Fractal Based Data Perturbation Algorithm For Privacy Protection

Many organizations publish anonymized medical data for sociology research, health research, education, and other useful studies. Although attributes that clearly identify individuals, such as names and certain personal identity numbers, are removed, a combination of other information, such as date of birth, gender, and postcode, can still be used to identify an individual. Existing data perturbation techniques can de-identify the data prior to publishing, but they make the process irreversible, so the original data cannot be fully recovered. A major issue is how to maintain the usability and utility of privacy-protected data while keeping the published data restorable for authorized users. In this paper, we propose a novel, robust data perturbation algorithm that can withstand brute-force attacks, while the perturbed data pattern is indistinguishable from the original data pattern. A distinguishing feature of our method is that, by using fractal theory to derive perturbation vectors, it provides high privacy protection together with fully reversible data perturbation while maintaining maximal data utility. Experiments on practical data confirm the desired operation of the algorithm and its effectiveness. The results lead us to conclude that the proposed approach is computationally resistant to brute-force attacks and maintains the same data distribution type as the original data.
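The abstract does not spell out how the perturbation vectors are constructed, so the following is only a minimal sketch of the general idea of a keyed, fully reversible perturbation, not the algorithm proposed in the paper. It assumes a logistic map as the chaotic generator, integer-valued attributes in the range [0, modulus), and modular addition as the perturbation step. The names `logistic_offsets`, `perturb`, `restore`, and the parameters `key`, `mu`, `modulus`, and `burn_in` are all hypothetical. Unlike the method described above, this sketch makes no attempt to preserve the distribution type of the original data.

```python
def logistic_offsets(key, mu, n, modulus, burn_in=100):
    """Derive n integer offsets in [0, modulus) from a logistic-map orbit.

    Assumption: key is a float in (0, 1) acting as the secret seed, and
    mu is the logistic-map parameter (values near 4 give chaotic orbits).
    """
    x = key
    # Discard the first iterations so the offsets depend sensitively on the key.
    for _ in range(burn_in):
        x = mu * x * (1.0 - x)
    offsets = []
    for _ in range(n):
        x = mu * x * (1.0 - x)
        offsets.append(int(x * modulus) % modulus)
    return offsets


def perturb(values, key, mu=3.99, modulus=150):
    """Shift each value by its key-derived offset, wrapping at modulus."""
    offsets = logistic_offsets(key, mu, len(values), modulus)
    return [(v + o) % modulus for v, o in zip(values, offsets)]


def restore(perturbed, key, mu=3.99, modulus=150):
    """Regenerate the same offsets from the key and undo the shift exactly."""
    offsets = logistic_offsets(key, mu, len(perturbed), modulus)
    return [(p - o) % modulus for p, o in zip(perturbed, offsets)]


# Illustrative data only: ages as a quasi-identifier column.
ages = [34, 71, 25, 58, 42]
secret_key = 0.3141592653

masked = perturb(ages, secret_key)
recovered = restore(masked, secret_key)
assert recovered == ages  # an authorized user holding the key recovers the data
```

Because the offsets are regenerated deterministically from the key, restoration is exact for an authorized user, while anyone without the key would have to search the key space. Working in integer arithmetic modulo a fixed range avoids the rounding errors that would otherwise break exact reversibility.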
