Review of the state-of-the-art methods for Privacy Preserved Classification in Outsourced Environment

An outsourced environment such as the cloud has become the default choice for organizations to store their data because of its extensive resources and on-demand access. Generally, a cloud service encrypts the data it collects from different data owners with its own credentials. When the data is sensitive, data owners instead encrypt it themselves before outsourcing to ensure confidentiality. However, conventional encryption methods do not allow operations on encrypted data directly, so any data mining task such as classification requires the cloud to decrypt the data at some point, which lets the cloud learn the sensitive data. The privacy of a user's classification query is also at stake. The alternative is for data owners to perform the task partially or fully at their end, but they are reluctant to run such heavy computations locally. This creates a need for a mechanism that performs classification over encrypted data in an outsourced environment while preserving its privacy. In this paper, we discuss the current status of, and the problems associated with, existing methods for classification over encrypted data, assuming that both the encrypted data and the classification process are outsourced to the cloud. We also analyze the Paillier homomorphic encryption method to see how its properties can be leveraged to preserve the privacy of sensitive data uploaded to the cloud.
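
To make the Paillier property referred to above concrete, the following is a minimal sketch of the textbook scheme and its additive homomorphism. It is an illustration only, not the construction analyzed in the paper: the function names (`keygen`, `encrypt`, `decrypt`, `add_encrypted`, `scale_encrypted`), the choice of generator g = n + 1, and the tiny primes are all assumptions made for readability, and keys this small offer no security.

```python
import math
import secrets


def keygen(p, q):
    """Build a toy Paillier key pair from two distinct primes (assumed inputs)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    g = n + 1  # common simplifying choice of generator (an assumption here)
    # With g = n + 1, L(g^lam mod n^2) equals lam mod n, so mu is its inverse.
    mu = pow(lam, -1, n)
    return (n, g), (lam, mu)


def encrypt(pub, m):
    """Encrypt an integer 0 <= m < n with fresh randomness r coprime to n."""
    n, g = pub
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1  # r in [1, n-1]
        if math.gcd(r, n) == 1:
            break
    return (pow(g, m, n2) * pow(r, n, n2)) % n2


def decrypt(pub, priv, c):
    """Recover m = L(c^lam mod n^2) * mu mod n, where L(u) = (u - 1) // n."""
    n, _ = pub
    lam, mu = priv
    u = pow(c, lam, n * n)
    return ((u - 1) // n * mu) % n


def add_encrypted(pub, c1, c2):
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    n, _ = pub
    return (c1 * c2) % (n * n)


def scale_encrypted(pub, c, k):
    """Raising a ciphertext to a plaintext constant k multiplies the plaintext by k."""
    n, _ = pub
    return pow(c, k, n * n)


# Toy parameters: 1009 and 1013 are small primes chosen only for the demo.
pub, priv = keygen(1009, 1013)
c_a = encrypt(pub, 42)
c_b = encrypt(pub, 58)

sum_plain = decrypt(pub, priv, add_encrypted(pub, c_a, c_b))
scaled_plain = decrypt(pub, priv, scale_encrypted(pub, c_a, 3))
print(sum_plain, scaled_plain)
```

In this sketch the party holding only `pub` can combine ciphertexts with `add_encrypted` and `scale_encrypted` without ever seeing 42 or 58; only the holder of `priv` can decrypt the results. Note that the scheme supports addition of ciphertexts and multiplication by a plaintext constant, not multiplication of two ciphertexts.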
