A comparative study on innovative approaches for privacy-preservation in knowledge discovery

The growing volume of data and the pressure to extract useful knowledge from it in many different ways have made privacy preservation a crucial subject. The issue is even more important in big data environments, which rely on knowledge discovery and data mining to produce useful information. Besides the intrinsic importance of protecting personal data, the efficiency of privacy-preserving approaches is a key factor, because these methods introduce overheads that reduce the accuracy of data mining results. Soft computing is a general name for a group of logic-based methods with several uses; one recent use is privacy preservation in big data. This paper presents a comprehensive survey of conventional privacy-preserving methods for knowledge discovery in databases (KDD) and data mining, and then discusses why soft computing methods can serve as a substitute for them in these environments. In addition to analyzing and discussing the merits and shortcomings of the approaches, the paper presents a conceptual framework of the state of the art in privacy preservation and identifies research gaps and future work.
