Chiron: Privacy-preserving Machine Learning as a Service

Major cloud operators offer machine learning (ML) as a service, enabling customers who have the data but not ML expertise or infrastructure to train predictive models on this data. Existing ML-as-a-service platforms require users to reveal all training data to the service operator. We design, implement, and evaluate Chiron, a system for privacy-preserving machine learning as a service. First, Chiron conceals the training data from the service operator. Second, in keeping with how many existing ML-as-a-service platforms work, Chiron reveals neither the training algorithm nor the model structure to the user, providing only black-box access to the trained model. Chiron is implemented using SGX enclaves, but SGX alone does not achieve the dual goals of data privacy and model confidentiality. Chiron runs the standard ML training toolchain (including the popular Theano framework and C compiler) in an enclave, but the untrusted model-creation code from the service operator is further confined in a Ryoan sandbox to prevent it from leaking the training data outside the enclave. To support distributed training, Chiron executes multiple concurrent enclaves that exchange model parameters via a parameter server. We evaluate Chiron on popular deep learning models, focusing on benchmark image classification tasks such as CIFAR and ImageNet, and show that its training performance and accuracy of the resulting models are practical for common uses of ML-as-a-service.

[1]  Yehuda Lindell,et al.  Privacy Preserving Data Mining , 2000, Journal of Cryptology.

[2]  Chris Clifton,et al.  Privacy-preserving Naïve Bayes classification , 2008, The VLDB Journal.

[3]  Vitaly Shmatikov,et al.  Privacy-preserving remote diagnostics , 2007, CCS '07.

[4]  Neha Narula,et al.  Native Client: A Sandbox for Portable, Untrusted x86 Native Code , 2009, IEEE Symposium on Security and Privacy.

[5]  Alex Krizhevsky,et al.  Learning Multiple Layers of Features from Tiny Images , 2009 .

[6]  Fei-Fei Li,et al.  ImageNet: A large-scale hierarchical image database , 2009, 2009 IEEE Conference on Computer Vision and Pattern Recognition.

[7]  Vitaly Shmatikov,et al.  Privacy-Preserving Classifier Learning , 2009, Financial Cryptography.

[8]  Bennet S. Yee,et al.  Adapting Software Fault Isolation to Contemporary CPU Architectures , 2010, USENIX Security Symposium.

[9]  Marc'Aurelio Ranzato,et al.  Large Scale Distributed Deep Networks , 2012, NIPS.

[10]  Geoffrey E. Hinton,et al.  ImageNet classification with deep convolutional neural networks , 2012, Commun. ACM.

[11]  Hovav Shacham,et al.  Iago attacks: why the system call API is a bad untrusted RPC interface , 2013, ASPLOS '13.

[12]  Galen C. Hunt,et al.  Shielding Applications from an Untrusted Cloud with Haven , 2014, OSDI.

[13]  Trishul M. Chilimbi,et al.  Project Adam: Building an Efficient and Scalable Deep Learning Training System , 2014, OSDI.

[14]  Vitaly Shmatikov,et al.  Privacy-preserving deep learning , 2015, Allerton.

[15]  Sergey Ioffe,et al.  Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift , 2015, ICML.

[16]  Christos Gkantsidis,et al.  VC3: Trustworthy Data Analytics in the Cloud Using SGX , 2015, 2015 IEEE Symposium on Security and Privacy.

[17]  Zheng Zhang,et al.  MXNet: A Flexible and Efficient Machine Learning Library for Heterogeneous Distributed Systems , 2015, ArXiv.

[18]  Kenta Oono,et al.  Chainer : a Next-Generation Open Source Framework for Deep Learning , 2015 .

[19]  Shafi Goldwasser,et al.  Machine Learning Classification over Encrypted Data , 2015, NDSS.

[20]  Marcus Peinado,et al.  Controlled-Channel Attacks: Deterministic Side Channels for Untrusted Operating Systems , 2015, 2015 IEEE Symposium on Security and Privacy.

[21]  Andrew Zisserman,et al.  Very Deep Convolutional Networks for Large-Scale Image Recognition , 2014, ICLR.

[22]  Samy Bengio,et al.  Revisiting Distributed Synchronous SGD , 2016, ArXiv.

[23]  Yuan Yu,et al.  TensorFlow: A system for large-scale machine learning , 2016, OSDI.

[24]  Michael Naehrig,et al.  CryptoNets: applying neural networks to encrypted data with high throughput and accuracy , 2016, ICML 2016.

[25]  David M. Eyers,et al.  SCONE: Secure Linux Containers with Intel SGX , 2016, OSDI.

[26]  Eric P. Xing,et al.  GeePS: scalable deep learning on distributed GPUs with a GPU-specialized parameter server , 2016, EuroSys.

[27]  Sebastian Nowozin,et al.  Oblivious Multi-Party Machine Learning on Trusted Processors , 2016, USENIX Security Symposium.

[28]  Frank Piessens,et al.  Ariadne: A Minimal Approach to State Continuity , 2016, USENIX Security Symposium.

[29]  Ian Goodfellow,et al.  Deep Learning with Differential Privacy , 2016, CCS.

[30]  Jeffrey S. Chase,et al.  CQSTR: Securing Cross-Tenant Applications with Cloud Containers , 2016, SoCC.

[31]  Emmett Witchel,et al.  Ryoan: A Distributed Sandbox for Untrusted Computation on Secret Data , 2016, OSDI.

[32]  Vitaly Shmatikov,et al.  Machine Learning Models that Remember Too Much , 2017, CCS.

[33]  Yao Lu,et al.  Oblivious Neural Network Predictions via MiniONN Transformations , 2017, IACR Cryptol. ePrint Arch..

[34]  Donald E. Porter,et al.  Graphene-SGX: A Practical Library OS for Unmodified Applications on SGX , 2017, USENIX Annual Technical Conference.

[35]  Geoffrey E. Hinton,et al.  Outrageously Large Neural Networks: The Sparsely-Gated Mixture-of-Experts Layer , 2017, ICLR.

[36]  Marcus Peinado,et al.  T-SGX: Eradicating Controlled-Channel Attacks Against Enclave Programs , 2017, NDSS.

[37]  Carl A. Gunter,et al.  Leaky Cauldron on the Dark Land: Understanding Memory Side-Channel Hazards in SGX , 2017, CCS.

[38]  Srdjan Capkun,et al.  ROTE: Rollback Protection for Trusted Execution , 2017, USENIX Security Symposium.

[39]  Andrew Baumann Hardware is the new Software , 2017, HotOS.

[40]  Ion Stoica,et al.  Opaque: An Oblivious and Encrypted Distributed Analytics Platform , 2017, NSDI.

[41]  Srdjan Capkun,et al.  Software Grand Exposure: SGX Cache Attacks Are Practical , 2017, WOOT.

[42]  Michael K. Reiter,et al.  Detecting Privileged Side-Channel Attacks in Shielded Execution with Déjà Vu , 2017, AsiaCCS.

[43]  Marcus Peinado,et al.  Inferring Fine-grained Control Flow Inside SGX Enclaves with Branch Shadowing , 2016, USENIX Security Symposium.

[44]  Martín Abadi,et al.  Semi-supervised Knowledge Transfer for Deep Learning from Private Training Data , 2016, ICLR.

[45]  Vitaly Shmatikov,et al.  Membership Inference Attacks Against Machine Learning Models , 2016, 2017 IEEE Symposium on Security and Privacy (SP).