Adaptive Utility-based Anonymization Model: Performance Evaluation on Big Data Sets

Abstract: Data anonymization is a globally accepted mechanism for protecting the privacy of individuals in data publishing scenarios. Anonymization normally degrades data quality, which is critical to the success of knowledge-based applications. An intelligent approach based on association mining, namely Adaptive Utility-based Anonymization (AUA), has been proposed to address this issue. The model was initially tested with sample instances of the original National Family Health Survey (NFHS-3) data set. This paper presents a performance evaluation of the AUA model on big data sets and shows that data anonymization can be done without compromising the quality of data mining results.
