Privacy-Preserving Classification Rule Mining for Balancing Data Utility and Knowledge Privacy Using Adapted Binary Firefly Algorithm

Privacy-preserving data mining is an emerging research area that addresses the integration of privacy concerns into data mining techniques. Classification is a data mining task that builds a model from labeled data and then uses that model to predict the class label of unseen data. A large amount of data is needed to build an accurate classifier, and sharing data is one way to obtain it. When data is shared among business associates, however, some sensitive patterns that can be derived from the data must not be revealed to the other parties. This raises the problem of keeping the shared data high in quality while hiding the sensitive patterns. This paper addresses classification rule hiding by proposing a novel method based on the data distortion approach. A computational intelligence technique, the binary firefly algorithm, is adapted to select the best way of altering instances and the optimal instances to alter, so that the loss of non-sensitive classification rules is reduced. The transformed data set, which reveals only non-sensitive knowledge, is then shared with the other parties. A set of experiments was carried out to assess the effectiveness of the proposed approach against existing similar approaches, using miss cost, artifacts, and the deviation between the original and transformed data sets as performance measures. The experiments and comparisons show that the proposed method preserves the privacy of sensitive classification rules while maintaining the quality of the transformed data set.
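The three performance measures named in the abstract can be illustrated with a short Python sketch. The abstract does not give their formulas, so the definitions below are assumptions based on how these measures are commonly described in rule-hiding work: miss cost as the fraction of non-sensitive rules lost after transformation, artifacts as the fraction of rules in the transformed data that did not exist originally, and deviation as the fraction of changed cells. The function names and the rule representation are hypothetical.

```python
from typing import FrozenSet, Sequence, Set, Tuple

# Hypothetical rule representation: (set of antecedent conditions, class label).
Rule = Tuple[FrozenSet[str], str]


def miss_cost(original: Set[Rule], sensitive: Set[Rule], transformed: Set[Rule]) -> float:
    """Fraction of non-sensitive original rules that can no longer be mined."""
    non_sensitive = original - sensitive
    if not non_sensitive:
        return 0.0
    lost = non_sensitive - transformed
    return len(lost) / len(non_sensitive)


def artifacts(original: Set[Rule], transformed: Set[Rule]) -> float:
    """Fraction of rules mined from the transformed data that are new (ghost rules)."""
    if not transformed:
        return 0.0
    ghost = transformed - original
    return len(ghost) / len(transformed)


def deviation(original_data: Sequence[Sequence[str]],
              transformed_data: Sequence[Sequence[str]]) -> float:
    """Fraction of attribute values that differ between the two data sets.

    Assumes both data sets have the same number of rows and columns.
    """
    total = sum(len(row) for row in original_data)
    if total == 0:
        return 0.0
    changed = sum(
        a != b
        for row_o, row_t in zip(original_data, transformed_data)
        for a, b in zip(row_o, row_t)
    )
    return changed / total


# Toy rules, invented purely for illustration.
r1: Rule = (frozenset({"age=young"}), "yes")
r2: Rule = (frozenset({"age=old"}), "no")
r3: Rule = (frozenset({"job=admin"}), "yes")
r4: Rule = (frozenset({"job=student"}), "no")
r5: Rule = (frozenset({"loan=yes"}), "no")

original_rules = {r1, r2, r3, r4}
sensitive_rules = {r1}
transformed_rules = {r2, r3, r5}  # r1 hidden, r4 lost, r5 newly introduced

data_before = [["young", "admin", "yes"], ["old", "student", "no"]]
data_after = [["old", "admin", "yes"], ["old", "student", "no"]]

mc = miss_cost(original_rules, sensitive_rules, transformed_rules)
af = artifacts(original_rules, transformed_rules)
dv = deviation(data_before, data_after)
```

Under these assumed definitions, lower values are better for all three measures: a low miss cost and few artifacts mean the mined knowledge is largely unchanged apart from the hidden rules, and a low deviation means few values were altered.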
