Evaluating applicability of perturbation techniques for privacy preserving data mining by descriptive statistics

Extensive research has been carried out on preserving the privacy of identifiers in datasets during data mining. Approaches based on cryptographic principles, perturbation, and secure sum computation have been studied to achieve privacy. Techniques that maximize privacy while minimizing information loss remain of continuing interest. This paper presents an experimental comparison of three fundamental perturbation techniques for Privacy Preserving Data Mining (PPDM): Additive, Multiplicative, and Geometric Data Perturbation (GDP). These techniques form the basis of many advanced perturbation techniques, as described later. The literature does not offer a clear-cut comparison of the three techniques based on suitable metrics. We identify statistical metrics that should be considered when evaluating perturbation techniques. This aspect of the research is treated independently here, and the paper examines the applicability of perturbation techniques through descriptive statistics in a single set of experiments. A comparison of the perturbation-based techniques is presented at the end to illustrate the importance of this research.
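As a rough illustration of the three techniques named above, the following minimal Python sketch perturbs a small toy dataset and compares descriptive statistics before and after. It is not the experimental setup of this paper: the Gaussian noise model, the parameter values, the restriction of geometric perturbation to two attributes (a single rotation angle), and all function and variable names (`additive`, `multiplicative`, `geometric`, `describe`, `dist`, `ages`, `scores`) are assumptions made for illustration. Geometric perturbation is sketched in its commonly described form of rotation plus translation plus noise.

```python
import math
import random
import statistics


def additive(column, sigma, rng):
    """Additive perturbation: y = x + e, with e drawn from N(0, sigma^2)."""
    return [x + rng.gauss(0.0, sigma) for x in column]


def multiplicative(column, sigma, rng):
    """Multiplicative perturbation: y = x * e, with e drawn from N(1, sigma^2)."""
    return [x * rng.gauss(1.0, sigma) for x in column]


def geometric(rows, theta, shift, sigma, rng):
    """Geometric perturbation of 2-D records: rotate by theta, translate, add noise."""
    c, s = math.cos(theta), math.sin(theta)
    out = []
    for a, b in rows:
        ra = c * a - s * b + shift[0] + rng.gauss(0.0, sigma)
        rb = s * a + c * b + shift[1] + rng.gauss(0.0, sigma)
        out.append((ra, rb))
    return out


def describe(column):
    """Two simple descriptive statistics: mean and population standard deviation."""
    return {"mean": statistics.mean(column), "stdev": statistics.pstdev(column)}


def dist(p, q):
    """Euclidean distance between two 2-D records."""
    return math.hypot(p[0] - q[0], p[1] - q[1])


# Toy data (hypothetical values, chosen only for the demonstration).
ages = [23.0, 35.0, 41.0, 52.0, 29.0, 47.0]
scores = [61.0, 74.0, 58.0, 80.0, 67.0, 71.0]
rng = random.Random(0)  # fixed seed so the run is repeatable

print("original      ", describe(ages))
print("additive      ", describe(additive(ages, 2.0, rng)))
print("multiplicative", describe(multiplicative(ages, 0.05, rng)))

records = list(zip(ages, scores))
perturbed = geometric(records, math.pi / 4, (10.0, -5.0), 0.5, rng)
print("geometric col0", describe([p[0] for p in perturbed]))

# Rotation and translation preserve pairwise distances; only the noise term changes them.
print("distance before", dist(records[0], records[1]))
print("distance after ", dist(perturbed[0], perturbed[1]))
```

In this sketch, additive and multiplicative noise leave the column mean roughly in place while changing its spread, whereas the geometric variant moves the individual attribute values substantially but keeps pairwise distances close to the originals when the noise term is small. Which statistics matter in practice depends on the mining task and on the metrics chosen for evaluation.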
