Private Model Compression via Knowledge Distillation

The soaring demand for intelligent mobile applications calls for deploying powerful deep neural networks (DNNs) on mobile devices. However, the outstanding performance of DNNs notoriously relies on increasingly complex models, whose computational expense far surpasses the capacity of mobile devices. Worse still, app service providers need to collect and use large volumes of users' data, which contain sensitive information, to build these sophisticated DNN models. Directly deploying such models on public mobile devices presents prohibitive privacy risks. To benefit from on-device deep learning without the capacity and privacy concerns, we design a private model compression framework, RONA. Following the knowledge distillation paradigm, we jointly use hint learning, distillation learning, and self learning to train a compact and fast neural network. The knowledge distilled from the cumbersome model is adaptively bounded and carefully perturbed to enforce differential privacy. We further propose an elegant query sample selection method to reduce the number of queries and control the privacy loss. A series of empirical evaluations, together with an implementation on an Android mobile device, shows that RONA not only compresses cumbersome models efficiently but also provides a strong privacy guarantee. For example, on SVHN, under a meaningful $(9.83, 10^{-6})$-differential privacy guarantee, the compact model trained by RONA achieves a 20$\times$ compression ratio and a 19$\times$ speed-up with merely 0.97% accuracy loss.
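To make the two core mechanisms in the abstract concrete, below is a minimal sketch (not the authors' code) of (1) bounding the teacher's distilled knowledge and perturbing it with Gaussian noise before it reaches the student, in the spirit of differential privacy, and (2) a standard temperature-scaled distillation loss combined with a hard-label (self-learning) term. Parameter names such as `clip_bound`, `noise_scale`, `temperature`, and `alpha` are illustrative assumptions, not values or interfaces taken from the paper; hint learning on intermediate features is omitted for brevity.

```python
# Illustrative sketch only: bounded + noised teacher knowledge and a
# distillation/self-learning loss. Hyperparameter names are assumptions.
import torch
import torch.nn.functional as F


def privatize_teacher_logits(teacher_logits, clip_bound=5.0, noise_scale=1.0):
    """Clip each logit vector to an L2 norm of `clip_bound`, then add
    Gaussian noise calibrated to that bound (Gaussian mechanism)."""
    norms = teacher_logits.norm(dim=1, keepdim=True).clamp(min=1e-12)
    clipped = teacher_logits * (clip_bound / norms).clamp(max=1.0)
    noise = torch.randn_like(clipped) * noise_scale * clip_bound
    return clipped + noise


def student_loss(student_logits, noisy_teacher_logits, labels,
                 temperature=4.0, alpha=0.7):
    """Weighted sum of the soft-label distillation loss (against the
    perturbed teacher outputs) and the hard-label cross-entropy."""
    soft_targets = F.softmax(noisy_teacher_logits / temperature, dim=1)
    soft_loss = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=1),
        soft_targets, reduction="batchmean") * (temperature ** 2)
    hard_loss = F.cross_entropy(student_logits, labels)
    return alpha * soft_loss + (1.0 - alpha) * hard_loss
```

In such a setup, the privacy cost of training would be governed by the noise scale, the clipping bound, and the number of teacher queries, which is why reducing queries via sample selection, as the abstract describes, directly controls the privacy loss.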
