Crowdlearning: Crowded Deep Learning with Data Privacy

Deep learning has shown promising performance in a variety of pattern recognition tasks, owing to large quantities of training data and complex neural network structures. However, conventional deep neural network (DNN) training involves centrally collecting and storing the training data and then centrally training the neural network, which raises serious privacy concerns for the data producers. In this paper, we study how to enable deep learning without disclosing individual data to the DNN trainer. We analyze the risks in conventional deep learning training, then propose a novel idea, Crowdlearning, which decentralizes the heavy-load training procedure and deploys the training onto a crowd of computation-restricted mobile devices that generate the training data. Finally, we propose SliceNet, which ensures that mobile devices can afford the computation cost while minimizing the total communication cost. The combination of Crowdlearning and SliceNet ensures that the sensitive data generated by mobile devices never leave the devices, and that the training procedure discloses hardly any inferable content. We numerically simulate our prototype of SliceNet, which crowdlearns an accurate DNN for image classification, and demonstrate high performance, acceptable computation and communication cost, satisfactory privacy protection, and a favorable convergence rate on a benchmark DNN structure and dataset.
