A New Method of Privacy Protection: Random k-Anonymous

A new k-anonymity method, different from traditional k-anonymity, is proposed for privacy protection. Specifically, numerical data achieve k-anonymity through added noise, and categorical data achieve it through randomization. Together, these two techniques remove the requirement that at least k records in a k-anonymous data set share the same quasi-identifier. Because finding anonymous equivalence classes is very time-consuming, a two-step clustering method is used to divide the original data set into equivalence classes: the data set is first divided into several sub-datasets, and the equivalence classes are then formed within each sub-dataset, which greatly reduces the computational cost of finding them. Experiments on three different data sets show that the proposed method is more efficient and that the anonymized data set incurs much smaller information loss.
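The pipeline described above can be sketched in a few lines of Python. This is only an illustrative sketch under stated assumptions, not the paper's algorithm: the abstract does not name the clustering procedure, the noise distribution, or the randomization rule. Here the coarse split uses a caller-supplied key, equivalence classes are formed by sorting on one numeric attribute and chunking, numeric values receive zero-mean Gaussian noise scaled by the class's value range, and categorical values are redrawn from the values present in the class. All function names, attribute names, and the sample records are hypothetical.

```python
import random
from collections import defaultdict


def split_into_subsets(records, coarse_key):
    """Step 1: partition the data set into coarse sub-datasets."""
    subsets = defaultdict(list)
    for record in records:
        subsets[coarse_key(record)].append(record)
    return list(subsets.values())


def form_classes(subset, k, num_attr):
    """Step 2: form equivalence classes of at least k records in one subset.

    Assumption: records are sorted on one numeric attribute and cut into
    consecutive chunks. A subset with fewer than k records stays undersized.
    """
    ordered = sorted(subset, key=lambda r: r[num_attr])
    classes = [ordered[i:i + k] for i in range(0, len(ordered), k)]
    if len(classes) > 1 and len(classes[-1]) < k:
        tail = classes.pop()      # fold a short tail into the previous class
        classes[-1].extend(tail)
    return classes


def randomize_class(eq_class, num_attr, cat_attr, rng):
    """Perturb one equivalence class (both rules below are assumptions)."""
    values = [r[num_attr] for r in eq_class]
    spread = max(values) - min(values)
    categories = [r[cat_attr] for r in eq_class]
    return [
        {
            **r,
            # Numeric: original value plus zero-mean Gaussian noise.
            num_attr: r[num_attr] + rng.gauss(0.0, spread),
            # Categorical: a value drawn at random from the class.
            cat_attr: rng.choice(categories),
        }
        for r in eq_class
    ]


def random_k_anonymize(records, k, coarse_key, num_attr, cat_attr, seed=0):
    """Run both clustering steps, then perturb each equivalence class."""
    rng = random.Random(seed)
    released = []
    for subset in split_into_subsets(records, coarse_key):
        for eq_class in form_classes(subset, k, num_attr):
            released.extend(randomize_class(eq_class, num_attr, cat_attr, rng))
    return released


# Hypothetical sample data: "age" and "job" act as quasi-identifiers.
records = [
    {"age": 23, "job": "nurse", "region": "N"},
    {"age": 27, "job": "clerk", "region": "N"},
    {"age": 31, "job": "nurse", "region": "N"},
    {"age": 45, "job": "pilot", "region": "N"},
    {"age": 52, "job": "clerk", "region": "N"},
    {"age": 34, "job": "cook", "region": "S"},
    {"age": 38, "job": "cook", "region": "S"},
    {"age": 61, "job": "pilot", "region": "S"},
]

released = random_k_anonymize(
    records, k=2, coarse_key=lambda r: r["region"],
    num_attr="age", cat_attr="job",
)
```

Because classes are searched for only inside each sub-dataset, the sorting and grouping work scales with the size of the sub-datasets rather than with the whole data set, which is the cost saving the abstract attributes to the two-step approach. The released records in a class no longer need to share an identical quasi-identifier, since each one is perturbed independently.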
