A Dual Perturbation Approach for Differentially Private ADMM-Based Distributed Empirical Risk Minimization

The rapid growth of data has raised the importance of privacy-preserving techniques in distributed machine learning. In this paper, we develop a privacy-preserving method for a class of regularized empirical risk minimization (ERM) problems. We first decentralize the learning algorithm using the alternating direction method of multipliers (ADMM), and then propose a dual variable perturbation method to provide dynamic differential privacy. Under mild convexity and differentiability conditions on the loss function and the regularizer, the mechanism yields a privacy-preserving algorithm. We study the performance of the algorithm, measured by the number of data points required to achieve a bounded error. To design an optimal privacy mechanism, we analyze the fundamental tradeoff between privacy and accuracy and provide guidelines for choosing the privacy parameters. Numerical experiments on a real-world dataset corroborate the results on the privacy-utility tradeoff and the mechanism design.
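
As a minimal sketch of the mechanism, in notation introduced here for illustration (the paper's exact update rules may differ): each node $i$ holds $n_i$ local samples $(x_{ij}, y_{ij})$, a local classifier $f_i$, a dual variable $\lambda_i$, and a neighbor set $\mathcal{N}_i$. Dual variable perturbation adds noise $\eta_i^{(t)}$, calibrated to the privacy parameter $\epsilon$, to the dual variable before each primal ADMM update:

\begin{align*}
\tilde{\lambda}_i^{(t)} &= \lambda_i^{(t)} + \eta_i^{(t)}, \qquad \eta_i^{(t)} \sim \text{noise calibrated to } \epsilon, \\
f_i^{(t+1)} &= \operatorname*{arg\,min}_{f_i}\ \frac{1}{n_i}\sum_{j=1}^{n_i} \ell\!\big(y_{ij}\, f_i^{\top} x_{ij}\big) + \rho\, R(f_i) + 2\,\tilde{\lambda}_i^{(t)\top} f_i + c \sum_{k\in\mathcal{N}_i} \Big\| f_i - \tfrac{1}{2}\big(f_i^{(t)} + f_k^{(t)}\big) \Big\|_2^2, \\
\lambda_i^{(t+1)} &= \lambda_i^{(t)} + \frac{c}{2} \sum_{k\in\mathcal{N}_i} \big( f_i^{(t+1)} - f_k^{(t+1)} \big).
\end{align*}

Since each node solves only the perturbed local problem and exchanges only $f_i^{(t+1)}$ with its neighbors, the noise in $\tilde{\lambda}_i^{(t)}$ masks the influence of any single data point on the shared iterates; applying this at every iteration is what gives the per-iteration (dynamic) differential privacy guarantee.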
