DeepChain: Auditable and Privacy-Preserving Deep Learning with Blockchain-Based Incentive

Deep learning can achieve higher accuracy than traditional machine learning algorithms in a variety of machine learning tasks. Recently, privacy-preserving deep learning, in which neither the training data nor the training model is expected to be exposed, has drawn tremendous attention from the information security community. Federated learning is a popular learning mechanism in which multiple parties upload local gradients to a server and the server updates the model parameters with the collected gradients. However, federated learning neglects many security problems: for example, participants may behave incorrectly in gradient collecting or parameter updating, and the server may itself be malicious. In this paper, we present a distributed, secure, and fair deep learning framework named DeepChain to solve these problems. DeepChain provides a value-driven incentive mechanism based on blockchain to force the participants to behave correctly. Meanwhile, DeepChain guarantees data privacy for each participant and provides auditability for the whole training process. We implement a DeepChain prototype and conduct experiments on a real dataset under different settings, and the results show that DeepChain is promising.
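The federated update described above can be illustrated with a minimal sketch. It assumes the server simply averages the uploaded gradients and takes one gradient-descent step; the abstract does not specify the aggregation rule, and the function names and learning rate here are hypothetical. This is plain federated learning without any of DeepChain's encryption, incentive, or audit mechanisms.

```python
from typing import List


def aggregate_gradients(local_gradients: List[List[float]]) -> List[float]:
    """Average the gradients uploaded by all parties, coordinate by coordinate.

    Averaging is an assumption made for illustration only.
    """
    if not local_gradients:
        raise ValueError("at least one party must upload a gradient")
    n = len(local_gradients)
    return [sum(column) / n for column in zip(*local_gradients)]


def server_update(
    params: List[float],
    local_gradients: List[List[float]],
    learning_rate: float = 0.1,  # hypothetical default
) -> List[float]:
    """Apply one gradient-descent step using the aggregated gradient."""
    avg = aggregate_gradients(local_gradients)
    return [p - learning_rate * g for p, g in zip(params, avg)]


# Two parties upload local gradients for a two-parameter model.
new_params = server_update(
    [1.0, 2.0],
    [[1.0, 0.0], [3.0, 2.0]],
    learning_rate=0.5,
)
```

Note that nothing in this sketch lets the server check that an uploaded gradient is honest, or lets the parties check that the server updated the parameters correctly, which is the gap the abstract points to.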
