Undistillable: Making a Nasty Teacher That CANNOT Teach Students

Knowledge distillation (KD) is a widely used technique for transferring knowledge from a pre-trained teacher model to a (usually more lightweight) student model. In certain situations, however, this technique is more of a curse than a blessing. For instance, KD poses a potential risk of exposing intellectual property (IP): even if a trained machine learning model is released as a “black box” (e.g., as executable software or an API, without open-sourcing the code), it can still be replicated by KD through imitating its input-output behavior. To prevent this unwanted effect of KD, this paper introduces and investigates the concept of a Nasty Teacher: a specially trained teacher network that yields nearly the same performance as a normal one, but significantly degrades the performance of any student model that learns by imitating it. We propose a simple yet effective algorithm for building such a nasty teacher, called self-undermining knowledge distillation. Specifically, we maximize the difference between the output of the nasty teacher and that of a normally pre-trained network, while preserving the nasty teacher's own classification accuracy. Extensive experiments on several datasets demonstrate that our method is effective against both standard KD and data-free KD, providing model owners with the desired immunity to KD for the first time. We hope this preliminary study draws more awareness of, and interest in, this new practical problem of both social and legal importance. Our code and pre-trained models can be found at https://github.com/VITA-Group/Nasty-Teacher.
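To make the self-undermining objective concrete, here is a minimal PyTorch sketch of the training loss under the formulation just described: minimize cross-entropy on the true labels while maximizing the temperature-scaled KL divergence to a frozen, normally pre-trained network of the same architecture. The function name and the hyperparameters `tau` and `omega` are illustrative placeholders, not the repository's exact API.

```python
# Minimal sketch of the self-undermining KD loss (names are assumptions,
# not the released code's API). `nasty_logits` come from the teacher being
# trained; `normal_logits` come from a frozen, normally pre-trained network.
import torch
import torch.nn.functional as F

def self_undermining_loss(nasty_logits, normal_logits, labels,
                          tau: float = 4.0, omega: float = 0.04):
    # Standard cross-entropy keeps the nasty teacher accurate on the task.
    ce = F.cross_entropy(nasty_logits, labels)
    # KL divergence between temperature-softened output distributions.
    # kl_div expects log-probabilities as input and probabilities as target;
    # the tau**2 factor is the usual gradient rescaling used in KD.
    kl = F.kl_div(
        F.log_softmax(nasty_logits / tau, dim=1),
        F.softmax(normal_logits.detach() / tau, dim=1),
        reduction="batchmean",
    ) * (tau ** 2)
    # Subtracting the KL term means minimizing this loss *maximizes* the
    # divergence from the normal network's softened outputs.
    return ce - omega * kl
```

The minus sign on the KL term is the crux of the design: cross-entropy preserves the teacher's top-1 predictions, while pushing its softened output distribution away from a normal network's scrambles the "dark knowledge" that a student would otherwise imitate.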
