A Differential Privacy Support Vector Machine Classifier Based on Dual Variable Perturbation

Data mining extracts potentially valuable information from massive data, and the support vector machine (SVM) is one of the most widely used and efficient classification methods in the field. However, training data often contain sensitive attributes, and the traditional SVM training method can reveal individuals' private information. To address the low prediction accuracy and poor versatility of existing privacy-preserving SVM classifiers, this paper proposes a new SVM training method with differential privacy protection. The algorithm first solves the dual problem of the SVM using the SMO method and records, for each support vector, the difference $E_{i}$ between the estimated value and the real value. It then calculates the ratio of each support vector's $E_{i}$ to the sum of the $E_{i}$ over all support vectors. Next, according to this ratio, it adds a different level of Laplace random noise to the dual variable $\alpha_{i}$ of each support vector to be released. By the principles of differential privacy, the algorithm satisfies $\epsilon$-differential privacy and can therefore effectively protect individual privacy. Experimental results on real datasets show that the proposed algorithm can be used for classification prediction under a reasonable privacy budget.
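
The perturbation step described above can be sketched as follows. This is a minimal illustration, not the paper's exact procedure: the abstract does not give the noise calibration, so the function name `perturb_dual_variables`, the use of $|E_{i}|$ to form the ratios, the rule that support vector $i$ receives the budget share $\epsilon_{i} = r_{i}\,\epsilon$, and the `sensitivity` value of 1.0 are all assumptions made for illustration. The dual variables and errors are assumed to come from an already trained SMO solver.

```python
import numpy as np

def perturb_dual_variables(alpha, errors, epsilon, sensitivity=1.0, rng=None):
    """Add per-support-vector Laplace noise to dual variables (illustrative sketch).

    alpha       : dual variables alpha_i of the support vectors
    errors      : E_i, estimated value minus real value, per support vector
    epsilon     : total privacy budget
    sensitivity : assumed sensitivity of each alpha_i (hypothetical value)
    """
    rng = np.random.default_rng() if rng is None else rng
    alpha = np.asarray(alpha, dtype=float)
    abs_err = np.abs(np.asarray(errors, dtype=float))  # assumption: use |E_i|

    total = abs_err.sum()
    if total == 0:
        # Degenerate case: fall back to equal shares.
        ratios = np.full(alpha.shape, 1.0 / alpha.size)
    else:
        ratios = abs_err / total  # ratio of each E_i to the sum over all E_i

    # Assumption: split the budget in proportion to the ratios, so the
    # per-vector budgets sum to epsilon.
    eps_i = np.maximum(ratios * epsilon, 1e-12)  # guard against zero shares
    scales = sensitivity / eps_i                 # Laplace scale b_i per vector

    noise = rng.laplace(loc=0.0, scale=scales)
    return alpha + noise, scales

# Usage with made-up values standing in for SMO output.
noisy_alpha, scales = perturb_dual_variables(
    alpha=[0.5, 1.0, 0.2],
    errors=[0.1, -0.3, 0.6],
    epsilon=1.0,
    rng=np.random.default_rng(0),
)
```

Under this assumed allocation, a support vector with a larger share of the total error gets a larger share of the budget and therefore less noise; the opposite mapping would be an equally valid reading of the abstract.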
