Golden Grain: Building a Secure and Decentralized Model Marketplace for MLaaS

ML-as-a-service (MLaaS) is becoming increasingly popular and is changing people's lives. A natural requirement for MLaaS is to provide highly accurate prediction services. To achieve this, current MLaaS systems integrate and combine multiple well-trained models in their services. In reality, however, there is no easy way for MLaaS providers, especially startups, to collect well-trained models from individual developers, because developers lack incentives to share them. In this paper, we aim to fill this gap by building a model marketplace, called GoldenGrain, that facilitates model sharing and enforces fair model-money swaps between individual developers and MLaaS providers. Specifically, we deploy the swapping process on the blockchain and introduce a blockchain-empowered model benchmarking design that transparently determines model prices according to their authentic performance, so as to incentivize the faithful contribution of well-trained models. To ease the blockchain overhead of benchmarking, our marketplace offloads the heavy computation off-chain and uses a secure off-chain/on-chain interaction protocol based on a trusted execution environment (TEE), ensuring both the integrity and the authenticity of benchmarking. We implement a prototype of GoldenGrain on the Ethereum blockchain, and extensive experiments with standard benchmark datasets demonstrate that the performance of our design is practically affordable.
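The flow described above (benchmark off-chain, authenticate the result, settle the payment on-chain) can be illustrated with a minimal Python sketch. It is not the GoldenGrain protocol. Every name in it (`ENCLAVE_KEY`, `benchmark_off_chain`, `ToyMarketContract`) is hypothetical, an HMAC over a shared key stands in for TEE attestation, and the price is assumed to be proportional to test accuracy, which the abstract does not specify.

```python
import hashlib
import hmac
import json

# Hypothetical stand-in for the enclave's attestation key. A real TEE would
# sign with a hardware-backed key, and a contract would verify a public-key
# signature rather than a shared-secret MAC.
ENCLAVE_KEY = b"demo-enclave-key"


def benchmark_off_chain(model, test_set, key=ENCLAVE_KEY):
    """Run the benchmark off-chain and return (report, authentication tag)."""
    correct = sum(1 for x, y in test_set if model(x) == y)
    report = {"correct": correct, "total": len(test_set)}
    payload = json.dumps(report, sort_keys=True).encode()
    tag = hmac.new(key, payload, hashlib.sha256).hexdigest()
    return report, tag


class ToyMarketContract:
    """Toy model of the on-chain side: holds a deposit, pays on a valid report."""

    def __init__(self, deposit, key=ENCLAVE_KEY):
        self.deposit = deposit  # funds locked by the MLaaS provider
        self.key = key
        self.paid = 0           # amount released to the model developer

    def settle(self, report, tag):
        # Recompute the tag; reject any report that was altered in transit.
        payload = json.dumps(report, sort_keys=True).encode()
        expected = hmac.new(self.key, payload, hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, tag):
            raise ValueError("report failed authentication")
        if report["total"] <= 0:
            raise ValueError("empty benchmark")
        # Assumed pricing rule: pay in proportion to accuracy, using integer
        # arithmetic as a smart contract would.
        price = self.deposit * report["correct"] // report["total"]
        self.paid = price
        self.deposit -= price
        return price


# Usage: a toy "model" that predicts parity, scored on four labelled samples.
model = lambda x: x % 2
test_set = [(1, 1), (2, 0), (3, 1), (4, 1)]
report, tag = benchmark_off_chain(model, test_set)
contract = ToyMarketContract(deposit=1000)
price = contract.settle(report, tag)
```

In this sketch the contract never runs the model itself. It only checks that the reported score is authentic before releasing funds, which mirrors the idea of keeping the heavy computation off the chain.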
