Quantization Backdoors to Deep Learning Models

There is currently a burgeoning demand for deploying deep learning (DL) models on ubiquitous edge Internet of Things (IoT) devices, owing to their low latency and strong privacy preservation. However, DL models are often large and computationally intensive, which prevents them from being placed directly onto IoT devices where resources are constrained and 32-bit floating-point (float-32) operations are unavailable. Model quantization is a pragmatic solution: it enables DL deployment on mobile devices and embedded systems by post-quantizing a large high-precision model (e.g., float-32) into a small low-precision model (e.g., int-8) while largely retaining inference accuracy. However, model quantization must be used with great care. This work reveals that the standard quantization operation can be abused to activate a backdoor. We demonstrate that a full-precision backdoored model that exhibits no backdoor effect in the presence of a trigger, because the backdoor is dormant, can be activated by the default TensorFlow-Lite (TFLite) quantization, the only product-ready quantization framework to date. In our experiments, we employ three popular model architectures (VGG16, ResNet18, and ResNet50), each trained on three popular datasets: MNIST, CIFAR10, and GTSRB. We ascertain that all nine trained float-32 backdoored models exhibit no backdoor effect even in the presence of trigger inputs. State-of-the-art frontend detection approaches, such as Neural Cleanse (model-level inspection) and STRIP (data-level inspection), fail to identify the backdoor in the float-32 models. When each of the float-32 models is converted into an int-8 model through the standard TFLite post-training quantization, the backdoor is activated in the quantized model, which shows a stable attack success rate close to 100% on inputs carrying the trigger while behaving normally on trigger-free inputs. This work highlights that a stealthy security threat arises when end users apply on-device post-training model quantization toolkits, underscoring the need for cross-platform re-inspection of DL models after quantization even when the full-precision models pass frontend inspections.
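To make the conversion step concrete, the sketch below shows the standard TFLite post-training full-integer quantization workflow that an end user would run to turn a float-32 Keras model into an int-8 TFLite model; it is this routine conversion that the attack abuses. This is a minimal illustration, not the authors' attack code: the file names (float32_model.h5, int8_model.tflite) and the calibration_images array used for the representative dataset are assumptions introduced for illustration.

```python
import tensorflow as tf
import numpy as np

# Load the (potentially backdoored) full-precision Keras model.
# "float32_model.h5" is a placeholder path, not from the paper.
model = tf.keras.models.load_model("float32_model.h5")

# A small set of unlabeled calibration inputs is needed to estimate
# activation ranges; "calibration_images" is assumed to be an array
# of preprocessed images with the model's input shape.
calibration_images = np.load("calibration_images.npy")

def representative_data_gen():
    for image in calibration_images[:100]:
        yield [image[None].astype("float32")]

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = representative_data_gen
# Restrict to int-8 kernels so both weights and activations are quantized.
converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter.inference_input_type = tf.int8
converter.inference_output_type = tf.int8

tflite_model = converter.convert()
with open("int8_model.tflite", "wb") as f:
    f.write(tflite_model)
```

Under the threat model described above, the float-32 model behaves benignly on trigger inputs, whereas the int8_model.tflite produced by this routine conversion responds to the trigger, so inspecting only the full-precision model before conversion is insufficient.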
