Helen: Maliciously Secure Coopetitive Learning for Linear Models

Many organizations wish to collaboratively train machine learning models on their combined datasets for a common benefit (e.g., better medical research, or fraud detection). However, they often cannot share their plaintext datasets due to privacy concerns and/or business competition. In this paper, we design and build Helen, a system that allows multiple parties to train a linear model without revealing their data, a setting we call coopetitive learning. Compared to prior secure training systems, Helen protects against a much stronger adversary who is malicious and can compromise m−1 out of m parties. Our evaluation shows that Helen can achieve up to five orders of magnitude of performance improvement when compared to training using an existing state-of-the-art secure multi-party computation framework.

[1]  Andrew Chi-Chih Yao,et al.  Protocols for secure computations , 1982, FOCS 1982.

[2]  Richard Cleve,et al.  Limits on the security of coin flips when half the processors are faulty , 1986, STOC '86.

[3]  Silvio Micali,et al.  How to play ANY mental game , 1987, STOC.

[4]  Avi Wigderson,et al.  Completeness theorems for non-cryptographic fault-tolerant distributed computation , 1988, STOC '88.

[5]  Pascal Paillier,et al.  Public-Key Cryptosystems Based on Composite Degree Residuosity Classes , 1999, EUROCRYPT.

[6]  Ivan Damgård,et al.  Multiparty Computation from Threshold Homomorphic Encryption , 2000, EUROCRYPT.

[7]  Fabrice Boudot,et al.  Efficient Proofs that a Committed Number Lies in an Interval , 2000, EUROCRYPT.

[8]  Jacques Stern,et al.  Sharing Decryption in the Context of Voting or Lotteries , 2000, Financial Cryptography.

[9]  Ivan Damgård,et al.  Efficient Concurrent Zero-Knowledge in the Auxiliary String Model , 2000, EUROCRYPT.

[10]  Ivan Damgård,et al.  Client/Server Tradeoffs for Online Elections , 2002, Public Key Cryptography.

[11]  Juan A. Garay,et al.  Strengthening Zero-Knowledge Protocols Using Signatures , 2003, Journal of Cryptology.

[12]  H. Robbins A Stochastic Approximation Method , 1951 .

[13]  Dan Bogdanov,et al.  Sharemind: A Framework for Fast Privacy-Preserving Computations , 2008, ESORICS.

[14]  Benny Pinkas,et al.  FairplayMP: a system for secure multi-party computation , 2008, CCS.

[15]  Jens Groth Homomorphic Trapdoor Commitments to Group Elements , 2009, IACR Cryptol. ePrint Arch..

[16]  G. D'Angelo,et al.  Combining least absolute shrinkage and selection operator (LASSO) and principal-components analysis for detection of gene-gene interactions in genome-wide association studies , 2009, BMC proceedings.

[17]  Peter Norvig,et al.  The Unreasonable Effectiveness of Data , 2009, IEEE Intelligent Systems.

[18]  Thierry Bertin-Mahieux,et al.  The Million Song Dataset , 2011, ISMIR.

[19]  Stephen P. Boyd,et al.  Distributed Optimization and Statistical Learning via the Alternating Direction Method of Multipliers , 2011, Found. Trends Mach. Learn..

[20]  S. Fienberg,et al.  Secure multiple linear regression based on homomorphic encryption , 2011 .

[21]  Ivan Damgård,et al.  Multiparty Computation from Somewhat Homomorphic Encryption , 2012, IACR Cryptol. ePrint Arch..

[22]  Andreas Haeberlen,et al.  DJoin: differentially private join queries over distributed databases , 2012, OSDI 2012.

[23]  Markulf Kohlweiss,et al.  On the Non-malleability of the Fiat-Shamir Transform , 2012, INDOCRYPT.

[24]  Ran Canetti,et al.  Security and composition of cryptographic protocols: a tutorial (part I) , 2006, SIGA.

[25]  Stratis Ioannidis,et al.  Privacy-Preserving Ridge Regression on Hundreds of Millions of Records , 2013, 2013 IEEE Symposium on Security and Privacy.

[26]  Martin J. Wainwright,et al.  Local Privacy, Data Processing Inequalities, and Statistical Minimax Rates , 2013, 1302.3203.

[27]  Carlos V. Rozas,et al.  Innovative instructions and software model for isolated execution , 2013, HASP '13.

[28]  Vitaly Shmatikov,et al.  Privacy-preserving deep learning , 2015, 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton).

[29]  Christos Gkantsidis,et al.  VC3: Trustworthy Data Analytics in the Cloud Using SGX , 2015, 2015 IEEE Symposium on Security and Privacy.

[30]  Shafi Goldwasser,et al.  Machine Learning Classification over Encrypted Data , 2015, NDSS.

[31]  Martine De Cock,et al.  Fast, Privacy Preserving Linear Regression over Distributed Datasets based on Pre-Distributed Data , 2015, AISec@CCS.

[32]  Michael Naehrig,et al.  CryptoNets: applying neural networks to encrypted data with high throughput and accuracy , 2016, ICML 2016.

[33]  Srinivas Devadas,et al.  Intel SGX Explained , 2016, IACR Cryptol. ePrint Arch..

[34]  Mark Abney,et al.  A LASSO penalized regression approach for genome-wide association analyses using related individuals: application to the Genetic Analysis Workshop 19 simulated data , 2016, BMC Proceedings.

[35]  Marcel Keller,et al.  MASCOT: Faster Malicious Arithmetic Secure Computation with Oblivious Transfer , 2016, IACR Cryptol. ePrint Arch..

[36]  Abel N. Kho,et al.  SMCQL: Secure Query Processing for Private Data Networks , 2016, Proc. VLDB Endow..

[37]  Ian Goodfellow,et al.  Deep Learning with Differential Privacy , 2016, CCS.

[38]  Emmett Witchel,et al.  Ryoan: A Distributed Sandbox for Untrusted Computation on Secret Data , 2016, OSDI.

[39]  Hongmei Chen,et al.  The Study of Credit Scoring Model Based on Group Lasso , 2017, ITQM.

[40]  Yao Lu,et al.  Oblivious Neural Network Predictions via MiniONN Transformations , 2017, IACR Cryptol. ePrint Arch..

[41]  Payman Mohassel,et al.  SecureML: A System for Scalable Privacy-Preserving Machine Learning , 2017, 2017 IEEE Symposium on Security and Privacy (SP).

[42]  Mariana Raykova,et al.  Privacy-Preserving Distributed Linear Regression on High-Dimensional Data , 2017, Proc. Priv. Enhancing Technol..

[43]  Farinaz Koushanfar,et al.  Chameleon: A Hybrid Secure Computation Framework for Machine Learning Applications , 2018, IACR Cryptol. ePrint Arch..

[44]  Jonathan Katz,et al.  Global-Scale Secure Multiparty Computation , 2017, CCS.

[45]  Úlfar Erlingsson,et al.  Prochlo: Strong Privacy for Analytics in the Crowd , 2017, SOSP.

[46]  Marcel Keller,et al.  Overdrive: Making SPDZ Great Again , 2018, IACR Cryptol. ePrint Arch..

[47]  Somesh Jha,et al.  Privacy-Preserving Ridge Regression with only Linearly-Homomorphic Encryption , 2018, IACR Cryptol. ePrint Arch..

[48]  Dawn Xiaodong Song,et al.  Targeted Backdoor Attacks on Deep Learning Systems Using Data Poisoning , 2017, ArXiv.

[49]  Dan Boneh,et al.  Prio: Private, Robust, and Scalable Computation of Aggregate Statistics , 2017, NSDI.

[50]  Randy H. Katz,et al.  A Berkeley View of Systems Challenges for AI , 2017, ArXiv.

[51]  Sarvar Patel,et al.  Practical Secure Aggregation for Privacy-Preserving Machine Learning , 2017, IACR Cryptol. ePrint Arch..

[52]  Zhicong Huang,et al.  UnLynx: A Decentralized System for Privacy-Conscious Data Sharing , 2017, Proc. Priv. Enhancing Technol..

[53]  Marcus Peinado,et al.  Inferring Fine-grained Control Flow Inside SGX Enclaves with Branch Shadowing , 2016, USENIX Security Symposium.

[54]  Farinaz Koushanfar,et al.  DeepSecure: Scalable Provably-Secure Deep Learning , 2017, 2018 55th ACM/ESDA/IEEE Design Automation Conference (DAC).

[55]  A. Kidd,et al.  Survival prediction in mesothelioma using a scalable Lasso regression model: instructions for use and initial performance using clinical predictors , 2018, BMJ Open Respiratory Research.

[56]  Thomas F. Wenisch,et al.  Foreshadow: Extracting the Keys to the Intel SGX Kingdom with Transient Out-of-Order Execution , 2018, USENIX Security Symposium.

[57]  Brendan Dolan-Gavitt,et al.  Fine-Pruning: Defending Against Backdooring Attacks on Deep Neural Networks , 2018, RAID.

[58]  Anantha Chandrakasan,et al.  Gazelle: A Low Latency Framework for Secure Neural Network Inference , 2018, IACR Cryptol. ePrint Arch..

[59]  Chang Liu,et al.  Manipulating Machine Learning: Poisoning Attacks and Countermeasures for Regression Learning , 2018, 2018 IEEE Symposium on Security and Privacy (SP).

[60]  Úlfar Erlingsson,et al.  The Secret Sharer: Measuring Unintended Neural Network Memorization & Extracting Secrets , 2018, ArXiv.

[61]  Dawn Song,et al.  Towards Practical Differentially Private Convex Optimization , 2019, 2019 IEEE Symposium on Security and Privacy (SP).

[62]  Jean-Pierre Hubaux,et al.  Drynx: Decentralized, Secure, Verifiable System for Statistical Queries and Machine Learning on Distributed Datasets , 2019, IEEE Transactions on Information Forensics and Security.

[63]  Paulo Tabuada,et al.  Cloud-Based Quadratic Optimization With Partially Homomorphic Encryption , 2018, IEEE Transactions on Automatic Control.