Differential Privacy Principal Component Analysis for Support Vector Machines

In the big data era, massive high-dimensional data is produced continuously, which makes it harder to analyze and to protect. In this paper, we combine principal component analysis (PCA) with differential privacy (DP) to achieve both dimensionality reduction and privacy protection, and we use a support vector machine (SVM) to measure the utility of the processed data. Specifically, we introduce differential privacy mechanisms at different stages of the PCA-SVM algorithm and obtain two algorithms, DPPCA-SVM and PCADP-SVM. Both satisfy (ε, 0)-DP while achieving fast classification. We evaluate the two algorithms in terms of noise expectation and classification accuracy, through both theoretical proof and experimental verification. To further assess DPPCA-SVM, we also compare it with other algorithms. The results show that DPPCA-SVM provides excellent utility on different data sets even though it guarantees stricter privacy.
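As a rough illustration of what adding a differential privacy mechanism to the PCA stage can look like, the sketch below perturbs the second-moment matrix with Laplace noise before the eigendecomposition. It is not the paper's DPPCA-SVM or PCADP-SVM algorithm, whose details are not given in this abstract. The function name `dp_pca_components`, the row clipping to unit L2 norm, the replace-one-row notion of neighboring data sets, and the sensitivity bound 2d/n are all assumptions made for the example.

```python
import numpy as np


def dp_pca_components(X, k, epsilon, rng):
    """Top-k eigenvectors of a Laplace-perturbed second-moment matrix.

    Hypothetical helper for illustration; not the paper's algorithm.
    """
    n, d = X.shape

    # Assumption: scale rows down to L2 norm <= 1 so the bound below holds.
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    X = X / np.maximum(norms, 1.0)

    A = X.T @ X / n  # d x d second-moment matrix

    # Replacing one row changes A by at most 2*d/n in entrywise L1 norm,
    # because ||x x^T||_1 = ||x||_1^2 <= d * ||x||_2^2 <= d.
    sensitivity = 2.0 * d / n

    # Laplace mechanism: scale = sensitivity / epsilon gives (epsilon, 0)-DP.
    noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon, size=(d, d))

    # Keep the upper triangle and mirror it so the matrix stays symmetric.
    noise = np.triu(noise) + np.triu(noise, k=1).T

    eigvals, eigvecs = np.linalg.eigh(A + noise)
    order = np.argsort(eigvals)[::-1][:k]  # indices of the k largest
    return eigvecs[:, order]


rng = np.random.default_rng(0)
X = rng.normal(size=(500, 10))
V = dp_pca_components(X, k=3, epsilon=1.0, rng=rng)
Z = X @ V  # reduced features that a classifier such as an SVM could use
```

In this sketch only the components `V` carry the privacy guarantee; the projection `Z` is computed from the raw data, so a scheme that releases the projected data would need its own noise step.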
