Aggregating Private Sparse Learning Models Using Multi-Party Computation

We consider the problem of privately learning a sparse model across multiple sensitive datasets, and propose learning individual models locally and privately aggregating them using secure multi-party computation. In this paper, we report some preliminary experiments on distributed sparse linear discriminant analysis, showing both the feasibility and effectiveness of our approach on experiments using heart disease data collected across four hospitals.

[1]  Andrew Chi-Chih Yao,et al.  Protocols for secure computations , 1982, FOCS 1982.

[2]  A. Yao,et al.  Fair exchange with a semi-trusted third party (extended abstract) , 1997, CCS '97.

[3]  R. Tibshirani Regression Shrinkage and Selection via the Lasso , 1996 .

[4]  Yehuda Lindell,et al.  Privacy Preserving Data Mining , 2002, Journal of Cryptology.

[5]  Yuval Ishai,et al.  Extending Oblivious Transfers Efficiently , 2003, CRYPTO.

[6]  Cynthia Dwork,et al.  Privacy-Preserving Datamining on Vertically Partitioned Databases , 2004, CRYPTO.

[7]  Sheng Zhong,et al.  Privacy-Preserving Classification of Customer Data without Loss of Accuracy , 2005, SDM.

[8]  Chris Clifton,et al.  Privacy-preserving Naïve Bayes classification , 2008, The VLDB Journal.

[9]  Vladimir Kolesnikov,et al.  Improved Garbled Circuit: Free XOR Gates and Applications , 2008, ICALP.

[10]  Cynthia Dwork,et al.  Differential Privacy: A Survey of Results , 2008, TAMC.

[11]  Kamalika Chaudhuri,et al.  Privacy-preserving logistic regression , 2008, NIPS.

[12]  Yehuda Lindell,et al.  Secure Multiparty Computation for Privacy-Preserving Data Mining , 2009, IACR Cryptol. ePrint Arch..

[13]  Gideon S. Mann,et al.  Efficient Large-Scale Distributed Training of Conditional Maximum Entropy Models , 2009, NIPS.

[14]  Benny Pinkas,et al.  Secure Two-Party Computation is Practical , 2009, IACR Cryptol. ePrint Arch..

[15]  Bhiksha Raj,et al.  Multiparty Differential Privacy via Aggregation of Locally Trained Classifiers , 2010, NIPS.

[16]  Jonathan Katz,et al.  Faster Secure Two-Party Computation Using Garbled Circuits , 2011, USENIX Security Symposium.

[17]  J. Shao,et al.  Sparse linear discriminant analysis by thresholding for high dimensional data , 2011, 1105.3561.

[18]  T. Cai,et al.  A Direct Estimation Approach to Sparse Linear Discriminant Analysis , 2011, 1107.3442.

[19]  Anand D. Sarwate,et al.  Differentially Private Empirical Risk Minimization , 2009, J. Mach. Learn. Res..

[20]  Martin J. Wainwright,et al.  Communication-efficient algorithms for statistical optimization , 2012, 2012 IEEE 51st IEEE Conference on Decision and Control (CDC).

[21]  Maria-Florina Balcan,et al.  Distributed Learning, Communication Complexity and Privacy , 2012, COLT.

[22]  H. Zou,et al.  A direct approach to sparse discriminant analysis in ultra-high dimensions , 2012 .

[23]  Daniel Kifer,et al.  Private Convex Empirical Risk Minimization and High-dimensional Regression , 2012, COLT 2012.

[24]  Pravesh Kothari,et al.  25th Annual Conference on Learning Theory Differentially Private Online Learning , 2022 .

[25]  Yang Feng,et al.  A road to classification in high dimensional space: the regularized optimal affine discriminant , 2010, Journal of the Royal Statistical Society. Series B, Statistical methodology.

[26]  Adam D. Smith,et al.  Differentially Private Feature Selection via Stability Arguments, and the Robustness of the Lasso , 2013, COLT.

[27]  Prateek Jain,et al.  Differentially Private Learning with Kernels , 2013, ICML.

[28]  Jonathan D. Rosenblatt,et al.  On the Optimality of Averaging in Distributed Statistical Learning , 2014, 1407.2724.

[29]  S. Geer,et al.  On asymptotically optimal confidence regions and tests for high-dimensional models , 2013, 1303.0518.

[30]  Adel Javanmard,et al.  Confidence intervals and hypothesis testing for high-dimensional regression , 2013, J. Mach. Learn. Res..

[31]  David Evans,et al.  Obliv-C: A Language for Extensible Data-Oblivious Computation , 2015, IACR Cryptol. ePrint Arch..

[32]  Vitaly Shmatikov,et al.  Privacy-preserving deep learning , 2015, 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[33]  Li Zhang,et al.  Nearly Optimal Private LASSO , 2015, NIPS.

[34]  Martin J. Wainwright,et al.  Divide and conquer kernel ridge regression: a distributed algorithm with minimax optimal rates , 2013, J. Mach. Learn. Res..

[35]  Qiang Liu,et al.  Communication-efficient sparse regression: a one-shot approach , 2015, ArXiv.

[36]  David Evans,et al.  Two Halves Make a Whole - Reducing Data Transfer in Garbled Circuits Using Half Gates , 2015, EUROCRYPT.

[37]  Jianqing Fan,et al.  Distributed Estimation and Inference with Statistical Guarantees , 2015, 1509.05457.

[38]  Mikhail Belkin,et al.  Learning privately from multiparty data , 2016, ICML.

[39]  Lu Tian,et al.  Communication-efficient Distributed Sparse Linear Discriminant Analysis , 2016, AISTATS.